Code comments should explain 'why' the code is written as it is. Otherwise, try to let the code document itself.
I use comments sparingly. A lot of comments generally means there’s a lot of un-modeled complexity lurking in the code. Simple example:
sig {params(retry_count: Integer).returns(Payload)}
def make_some_request(retry_count:); ...; end
We are under-specified here: what happens when retry_count is negative, or even zero?
The lazy way to combat this is with a code comment:
# `retry_count` must be a positive integer
This code comment states a constraint but not what happens when you violate it. So someone may point that out and add the corresponding special handling:
# `retry_count` defaults to zero if negative
def make_some_request(retry_count:)
  retry_count = retry_count >= 0 ? retry_count : 0
  # ...
end
But we can do better and avoid most code comments by leveraging types instead:
class PositiveInteger
  extend T::Sig

  sig {returns(Integer)}
  attr_reader :value

  # Reject zero and negatives at construction so every instance is valid.
  def initialize(int)
    if int < 1
      raise # NOTE: exceptions like this are bad, this is just for brevity
    end
    @value = int
  end
end
sig {params(retry_count: T.nilable(PositiveInteger)).returns(Payload)}
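To make the benefit concrete, here is a minimal sketch in plain Ruby, with the Sorbet signatures left out so it runs standalone. The `ArgumentError`, its message, and the body of `make_some_request` are assumptions for illustration; the original only shows the method's signature.

```ruby
# Plain-Ruby sketch of the wrapper type; Sorbet sigs omitted so it runs standalone.
class PositiveInteger
  attr_reader :value

  def initialize(int)
    # Assumption: ArgumentError is used here for illustration only.
    unless int.is_a?(Integer) && int >= 1
      raise ArgumentError, "expected a positive integer, got #{int.inspect}"
    end

    @value = int
  end
end

# Hypothetical stand-in for the real request method: it only reports how many
# attempts it would make, so the example has something observable to return.
def make_some_request(retry_count: nil)
  retries = retry_count ? retry_count.value : 0
  "1 attempt + #{retries} retries"
end

make_some_request(retry_count: PositiveInteger.new(3)) # valid by construction
make_some_request                                      # nil means "no retries"

begin
  PositiveInteger.new(0)
rescue ArgumentError => e
  e.message # the invalid value never reaches make_some_request
end
```

The method body no longer needs a comment or a clamp: zero and negative values fail where they are created, and "no retries" is spelled as `nil` rather than as a magic number.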
Leveraging the type system is better than comments because types enforce the constraint, while a comment can only describe it.
In most places where you are specifying the 'what', you are best served by using clearer variable names and leveraging smaller, more expressive types to encode invariants.
So when do I actually comment? To explain why. Why are we doing this instead of that, why do we have to call this method and if it fails call it again with different arguments, why is the code structured this way, and why did I get this test failure (in assertion messages)?
Code comments don’t enforce anything; they provide context, and that context always needs to be taken with a grain of salt. A good code comment documenting the why helps today's reader figure out what they might be able to do differently if external constraints have changed.
For example, Stripe's email framework restricts reply-to addresses in emails. This is not a technical limitation, but one in service of security and product quality. We don't want replies from users to get black-holed or end up somewhere that is not a proper support channel. Someone can read that code comment, understand the criteria / current posture, and have clear next steps on what to do.
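A comment in that spirit might look like the sketch below. Everything in it is hypothetical (the constant, the domain, the method name, and the suggested next step); it is not Stripe's code, only an illustration of a comment that records the reason for a restriction rather than restating what the code does.

```ruby
# Hypothetical sketch; the constant, domain, and method name are illustrative only.
SUPPORT_REPLY_TO_DOMAINS = ["support.example.com"].freeze

# WHY: reply-to addresses are restricted on purpose, not because of a technical
# limitation. Replies from users must land in a proper support channel rather
# than being black-holed. If you need a new reply-to address, first confirm it
# is a monitored support channel, then add its domain to the list above
# (an assumed next step, for illustration).
def validate_reply_to!(address)
  domain = address.split("@", 2).last
  return address if SUPPORT_REPLY_TO_DOMAINS.include?(domain)

  raise ArgumentError, "reply-to #{address} is not a monitored support channel"
end
```

The code already says what is allowed. The comment carries the part the code cannot: the policy behind the check and what a reader should do if that policy no longer fits.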
One place you'll see tons of useless comments is in generated boilerplate. It was reasonable when the code was first generated to include a bunch of comments explaining how to use and modify the generated code, but after that these comments are very noisy. It is hard to know which comments were boilerplate and which were deliberately left by the author.
To combat this in Stripe's email framework, I included a wholesome test to strip boiler-plate comments written by the code generator most engineers use to start writing an email. We used a special Ruby comment syntax:
#### This comment is boiler-plate / generated and should be removed
# This is a comment that should be preserved.
The code-mod to strip the comments is very simple:
def strip_generated_comments(code)
  code
    .lines
    .reject {|line| line.strip.start_with?('####')}
    .join
end
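As a quick illustration of the code-mod, the sketch below runs it over a small made-up file. The method is repeated so the example is self-contained; the `generated` snippet is hypothetical and not taken from the real generator.

```ruby
def strip_generated_comments(code)
  code
    .lines
    .reject { |line| line.strip.start_with?('####') }
    .join
end

# Hypothetical generated file contents, for illustration only.
generated = <<~'RUBY'
  #### This comment is boiler-plate / generated and should be removed
  # This is a comment that should be preserved.
  def subject
    #### Replace this with your subject line
    "Hello"
  end
RUBY

puts strip_generated_comments(generated)
# Prints:
#   # This is a comment that should be preserved.
#   def subject
#     "Hello"
#   end
```

Because each line is stripped before the prefix check, indented `####` comments are removed as well. A line of code that merely ends with a `####` comment is left alone, since it does not start with the marker.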
Boiler-plate comments from code generators should be removed. They temporarily explain the 'what' for convenience, but once you’ve seen an email or two in the framework, you’ve seen them ~all.