Comments On Comments

Comments On Comments
2023-8-27 08:0:0 Author: noncombatant.org(查看原文) 阅读量:0 收藏

27 August 2023

Programmers sometimes argue about whether comments are good, or bad, or bad but necessary, or better than no comments, and so on. It’s one of those debates where everyone is right part of the time, and nobody really convinces anybody, and then we do it again next month.

Maybe a year or so ago, some colleagues and I were going around this merry-go-round again, and Drew Fisher actually shed some light on the discussion that I hadn’t seen before. He said, “Comments should explain why, not what.” I realized it’s the what comments that I dislike — they are redundant at best, and contradict the actual code at worst (which happens frequently, in my experience). It’s the why comments that I like, and I bet most of us do.

There seem to be at least 3 kinds of comments in software code:

should comments: that talk about what the code should do, or what callers should do;
what comments: that talk about what code does; and
why comments: that explain why the code does what it does, or why or when callers should call it.

The state of comments in many codebases is not ideal. Often, we can and should

replace should comments with code that checks;
replace what comments with meaningful identifiers; and
explain why more often and more thoroughly.

Code examples in this post are in Go, since I work in a Go shop these days.

Replace Should Comments With Code

Here’s an example of a should comment (that also does not explain why):

// RunCommand runs command with the correct arguments for our application.
//
// The command should be one of "pumpkin", "noodle", or "tootle".
func RunCommand(command string) error {
	cmd := exec.Command(command, "woop", "boing")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

The command argument is a shell command, so we can imagine that the why is that we need to restrict the commands to 1 of a few known-safe commands — we don’t want to create a shell injection vulnerability! But it’s not great to make the reader imagine or guess. We should tell them.

This code enforces the should and avoids the vulnerability — it’s safe even if the authors of callers didn’t read the comment:

var SafeCommands = []string{
	"pumpkin",
	"noodle",
	"tootle",
}

// RunCommand runs command (which must be one of the SafeCommands) with the
// correct arguments for our application.
func RunCommand(command string) error {
	if !slices.Contains(SafeCommands, command) {
		return fmt.Errorf("%q is not in SafeCommands", command)
	}
	cmd := exec.Command(command, "woop", "boing")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

Explaining why commands must be safe is a good idea, too:

// SafeCommands limits the commands that RunCommand will run. This limit is
// necessary because we must prevent shell injection vulnerabilities and
// because even without injection, some commands are dangerous to run. Before
// you change this list, get approval from the SREs.
var SafeCommands = []string{
	"pumpkin",
	"noodle",
	"tootle",
}

This is now pretty good: we’ve replaced a should comment with an assertion that stops bad things from happening, and we explain why it’s necessary to stop bad things and why commands can be bad.

Personally, I’m still not entirely satisfied with this for 2 reasons:

I’d like for attempted invocations of unsafe commands to fail early, at compile time rather than at run time.
The comments are duplicative, and will need to be updated if the names RunCommand or SafeCommands change. People don’t always remember to do this when making changes, and then the comments refer to things that no longer exist.

This code, which creates a type that callers outside the current package can’t construct a meaningful instance of, gets closer to that ideal:

// SafeCommands limits the commands that RunCommand will run. This limit is
// necessary because we must prevent shell injection vulnerabilities and
// because even without injection, some commands are dangerous to run.
type SafeCommand struct {
	command string
}

// Before you change or add to these SafeCommands, get approval from the SREs.
var Pumpkin SafeCommand = SafeCommand{command: "pumpkin"}
var Noodle SafeCommand = SafeCommand{command: "noodle"}
var Tootle SafeCommand = SafeCommand{command: "tootle"}

// RunCommand runs command with the correct arguments for our application.
func RunCommand(command SafeCommand) error {
	cmd := exec.Command(command.command, "woop", "boing")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

Callers outside this package cannot construct a meaningful, non-zero instance of SafeCommand because all 1 of its fields, command, begins with a lower-case letter. In Go, identifiers that begin with lower-case letters are visible only from inside the package.

Note that I removed the assertion slices.Contains(SafeCommands, command). Now we use the SafeCommand type to enforce that commands are safe.

From inside the package, it is still possible to create an arbitrary SafeCommand. Depending on the situation, dynamic safety checking may also still be necessary. Even if it is, I like the inherent documentation-like quality of the new type name and the named global instances. If you create a minimal package such that all callers are outside the package, you may be able to leave the dynamic check out.

This way of doing things is a bit outside the Go cultural norm, but I hope it’s not too far outside.

Replace What Comments With Names

Type names, variable names, function names, and so on (identifiers) are all documentation. Many or most what comments are obviated by carefully chosen, meaningful identifiers. This is the Javadoc problem: Most Javadoc tediously repeats information that is already present in the identifiers themselves.

What comments are the most likely to get out of sync with the actual code, which is often worse than not having comments at all. Readers easily get confused as to whether the comment or the code is correct, especially in the absence of tests.

I often see long functions broken up into sections headed with what comments. It’s often better to replace the sections with calls to smaller (usually) private functions that do the job, and whose names do the work of the what comment.

Code full of what comments, like this:

func DownloadAndVerifyThing(path string) error {
	// Build the URL for the thing
	...

	// Fetch the URL
	...

	// Now verify the file
	...
}

can become:

func DownloadAndVerifyThing(path string) error {
	url := buildURL(path)
	pathname, err := fetchURL(url)
	if err != nil {
		return error
	}
	return verifyFile(pathname)
}

func buildURL(path string) *URL {
	return &URL{
		...
	}
}

func fetchURL(u *URL) (string, error) {
	...
}

func verifyFile(pathname string) error {
	...
}

Now, DownloadAndVerifyThing is shorter, contains less state, and is more obviously a composition of several tasks.

Explain Why Thoroughly

Depending on the situation, there may be many questions starting with “Why?” that the reader of your code might have. Try to anticipate and answer them. Here are some ideas:

Why is it necessary?
Why is it implemented this way and not in some other way?
Why is it so complicated? Are all these special cases/fancy algorithms/complex data structures necessary?
Why is it so simple? Don’t we need high efficiency/to handle more error conditions/to be more generic/et c.?
Why is it here and not somewhere else?
Why is it important to fix/refactor/change/not change?

文章来源: https://noncombatant.org/2023/08/27/comments
如有侵权请联系:admin#unsafe.sh