Programmers sometimes argue about whether comments are good, or bad, or bad but necessary, or better than no comments, and so on. It’s one of those debates where everyone is right part of the time, and nobody really convinces anybody, and then we do it again next month.
Maybe a year or so ago, some colleagues and I were going around this merry-go-round again, and Drew Fisher actually shed some light on the discussion that I hadn’t seen before. He said, “Comments should explain why, not what.” I realized it’s the what comments that I dislike — they are redundant at best, and contradict the actual code at worst (which happens frequently, in my experience). It’s the why comments that I like, and I bet most of us do.
There seem to be at least 3 kinds of comments in software code:
The state of comments in many codebases is not ideal. Often, we can and should
Code examples in this post are in Go, since I work in a Go shop these days.
Here’s an example of a should comment (that also does not explain why):
// RunCommand runs command with the correct arguments for our application. // // The command should be one of "pumpkin", "noodle", or "tootle". func RunCommand(command string) error { cmd := exec.Command(command, "woop", "boing") cmd.Stdout = os.Stdout cmd.Stderr = os.Stderr return cmd.Run() }
The command
argument is a shell command, so we can imagine that
the why is that we need to restrict the commands to 1 of a few known-safe
commands — we don’t want to create a shell injection vulnerability! But it’s not
great to make the reader imagine or guess. We should tell them.
This code enforces the should and avoids the vulnerability — it’s safe even if the authors of callers didn’t read the comment:
var SafeCommands = []string{ "pumpkin", "noodle", "tootle", } // RunCommand runs command (which must be one of the SafeCommands) with the // correct arguments for our application. func RunCommand(command string) error { if !slices.Contains(SafeCommands, command) { return fmt.Errorf("%q is not in SafeCommands", command) } cmd := exec.Command(command, "woop", "boing") cmd.Stdout = os.Stdout cmd.Stderr = os.Stderr return cmd.Run() }
Explaining why commands must be safe is a good idea, too:
// SafeCommands limits the commands that RunCommand will run. This limit is // necessary because we must prevent shell injection vulnerabilities and // because even without injection, some commands are dangerous to run. Before // you change this list, get approval from the SREs. var SafeCommands = []string{ "pumpkin", "noodle", "tootle", }
This is now pretty good: we’ve replaced a should comment with an assertion that stops bad things from happening, and we explain why it’s necessary to stop bad things and why commands can be bad.
Personally, I’m still not entirely satisfied with this for 2 reasons:
RunCommand
or SafeCommands
change. People don’t always
remember to do this when making changes, and then the comments refer to things
that no longer exist.This code, which creates a type that callers outside the current package can’t construct a meaningful instance of, gets closer to that ideal:
// SafeCommands limits the commands that RunCommand will run. This limit is // necessary because we must prevent shell injection vulnerabilities and // because even without injection, some commands are dangerous to run. type SafeCommand struct { command string } // Before you change or add to these SafeCommands, get approval from the SREs. var Pumpkin SafeCommand = SafeCommand{command: "pumpkin"} var Noodle SafeCommand = SafeCommand{command: "noodle"} var Tootle SafeCommand = SafeCommand{command: "tootle"} // RunCommand runs command with the correct arguments for our application. func RunCommand(command SafeCommand) error { cmd := exec.Command(command.command, "woop", "boing") cmd.Stdout = os.Stdout cmd.Stderr = os.Stderr return cmd.Run() }
Callers outside this package cannot construct a meaningful, non-zero instance
of SafeCommand
because all 1 of its fields, command
,
begins with a lower-case letter. In Go, identifiers that begin with lower-case
letters are visible only from inside the package.
Note that I removed the assertion slices.Contains(SafeCommands,
command)
. Now we use the SafeCommand
type to enforce that
commands are safe.
From inside the package, it is still possible to create an arbitrary
SafeCommand
. Depending on the situation, dynamic safety checking
may also still be necessary. Even if it is, I like the inherent
documentation-like quality of the new type name and the named global instances.
If you create a minimal package such that all callers are outside the package,
you may be able to leave the dynamic check out.
This way of doing things is a bit outside the Go cultural norm, but I hope it’s not too far outside.
Type names, variable names, function names, and so on (identifiers) are all documentation. Many or most what comments are obviated by carefully chosen, meaningful identifiers. This is the Javadoc problem: Most Javadoc tediously repeats information that is already present in the identifiers themselves.
What comments are the most likely to get out of sync with the actual code, which is often worse than not having comments at all. Readers easily get confused as to whether the comment or the code is correct, especially in the absence of tests.
I often see long functions broken up into sections headed with what comments. It’s often better to replace the sections with calls to smaller (usually) private functions that do the job, and whose names do the work of the what comment.
Code full of what comments, like this:
func DownloadAndVerifyThing(path string) error { // Build the URL for the thing ... // Fetch the URL ... // Now verify the file ... }
can become:
func DownloadAndVerifyThing(path string) error { url := buildURL(path) pathname, err := fetchURL(url) if err != nil { return error } return verifyFile(pathname) } func buildURL(path string) *URL { return &URL{ ... } } func fetchURL(u *URL) (string, error) { ... } func verifyFile(pathname string) error { ... }
Now, DownloadAndVerifyThing
is shorter, contains less state, and
is more obviously a composition of several tasks.
Depending on the situation, there may be many questions starting with “Why?” that the reader of your code might have. Try to anticipate and answer them. Here are some ideas: