ArticlesRuby programmingEngineering
Style guidelines and tips I've developed and adhere to when writing Ruby code.
nil?
present?
or blank?
unless
self
use
Hash.key?
Hash.fetch
begin..rescue
blocks
Class#name
This style guide started as a personal collection of fixes for nits, experienced bugs, and poor readability as I wrote and reviewed Ruby code at Stripe.
Stripe has an open source type-checker Sorbet for Ruby which I reference a lot here since the type system is pretty relevant to coding standards. However Sorbet knowledge is not needed for this collection. Sorbet has its own collection if you're interested.
nil?
Writing idiomatic code for any language means understanding its
rules. In Ruby, only false and nil are falsey. Coming from a
JavaScript background, this is awesome (so simple). As such,
it’s actually very uncommon to need to use nil?
:
if !x.nil?
# TODO
end
Can be written as:
if x
# TODO
end
While not strictly equivalent because of false, the use of
nil?
is quite noisy and if you’re reaching for it,
it’s often a sign of primitive obsession where you’re making nil
explicitly mean something.
present?
or blank?
The methods are usually implemented on the base
Object
making them quite infectious. An object is
blank if it's nil, false, empty, or a whitespace string.
Otherwise it is present.
Definitely don’t use present?
or
blank?
when you simply mean to check for
nil
.
These are really only useful for nil-able String values and otherwise introduce much more mental overhead than needed.
The use of present?
and blank?
should
really only be done at system boundaries (processing text
formats or user input) if at all, in order to simplify a String
to nil-able String. After that conversion process we can
leverage Sorbet: Sorbet can help with handling nil-able String
but it can't tell the programmer to account for empty strings in
String.
These methods are anti-patterns because they encourage extreme defensive checking which is less performant and complicates code complexity.
Within well typed code there is almost never a reason for such a
permissive interface of NilClass
,
TrueClass
, FalseClass
,
String
, and ObjectWithEmptyMethod
.
unless
unless x
# TODO
end
# ...is the same as...
if !x
# TODO
end
unless
is longer and harder to reason about. At
Stripe, we already had established lint rules for preventing
complex usages of unless
because it was so
unreadable. Even with single expression unless
it
was still unreadable so I don't advocate its use in any
situation.
Right-hands make it too easy for complexity to hide, especially when reviewing code. Since I’ve stop using them I’ve been less likely to need precondition guards. It’s easier for me to visualize I’m doing too much in one place and avoid it.
return if !x
# ...is the same as...
if !x
return
end
To using right-hand conditionals makes the flow of logic much more readable. Quickly, what is the cyclomatic complexity of this code?
really_long_expression_or_method_call_that_does_something if really_long_expression_or_method_call_that_does_something
really_long_expression_or_method_call_that_does_something if really_long_expression_or_method_call_that_does_something
really_long_expression_or_method_call_that_does_something if really_long_expression_or_method_call_that_does_something
really_long_expression_or_method_call_that_does_something
really_long_expression_or_method_call_that_does_something if really_long_expression_or_method_call_that_does_something
What is the cyclomatic complexity of this code?
if really_long_expression_or_method_call_that_does_something
really_long_expression_or_method_call_that_does_something
end
really_long_expression_or_method_call_that_does_something
if really_long_expression_or_method_call_that_does_something
really_long_expression_or_method_call_that_does_something
end
really_long_expression_or_method_call_that_does_something
It's much easier to see the branching logic because we can lean on the visual indentation to help us out.
This isn’t too clear, especially when you just start using Ruby but it’s really nice and has useful helpers. Try to leverage them to reduce “procedural noise” in your code. Some examples:
map
instead of each
# Bad
new_list = []
old_list.each do |item|
new_list.push(item + 1)
end
# Good
new_list = old_list.map {|item| item + 1}
flat_map
instead of each
# Bad
new_list = []
old_list.each do |item|
new_list.push(item + 1)
new_list.push(item + 2)
new_list.push(item + 3)
end
# Good
new_list = old_list.flat_map do |item|
[item + 1, item + 2, item + 3]
end
to_h
instead of each hash building
# Bad
new_hash = {}
old_list.each do |item|
new_hash[item.key] = item.value
end
# Good
new_hash = old_list.map do |item|
[item.key, item.value]
end.to_h
And know hash.transform_keys
and
hash.transform_values
are also things!
My team was pretty guilty of this when we all first started writing Ruby. We have some old code like:
module MyPackage::Private::Helpers
PM = MyPackage::Private::Model
PX = MyPackage::Private::Extras
def self.action
PM::MyModel.load('something')
end
end
The right way to write the above is:
module MyPackage
module Private::Helpers # <-- splitting this out lets us skip typing MyPackage
def self.action
Private::Model::MyModel.load('something')
end
end
end
Shorthands hinder readability and are confusing, preferring short term convenience for long term maintenance. Not the best trade-off especially now that its being maintained.
If you see these, don’t continue to to use them. I removed all of these usages so I don’t need this rule anymore; I’ve not seen anyone else do this.
self
use
The self
is implied in Ruby so no need to litter
the code with it when calling methods.
# Bad
class BankAccount
def withdraw(amount)
self.increment(-amount)
end
def deposit(amount)
self.increment(amount)
end
private increment(amount); end
end
# Good
class BankAccount
def withdraw(amount)
increment(-amount)
end
def deposit(amount)
increment(amount)
end
private increment(amount); end
end
Hash.key?
key?
rarely is what you want (and doesn't help with
type narrowing):
{a: nil}.key?(:a)
# => true
Better to just make the access and check the value:
value = hash[key]
if value
# ...
end
(The same applies to Hash.includes?(key)
.)
Hash.fetch
If a value does not exist for a key, a KeyError
is
raised which leads to ugly handling like:
value = begin
hash.fetch(:a)
rescue KeyError => error
# ...
end
Just use the common [key] method which returns a nil-able value that Sorbet can help us handle if it is missing:
value = hash[:a]
if value
# ...
end
Similar to Hash.key?
above, we have the unexpected
issue with explicit nils and default values:
{a: nil}.fetch(:a, 1)
# => nil
In Ruby everything is an expression, take advantage of it! The code is more clear, less noisy, and more oriented towards values rather than imperative statements.
# BAD
if foo
bar = 1
else
bar = 2
end
# GOOD
bar = if foo
1
else
2
end
# BAD
val = nil
if foo
val = 1
end
# GOOD
val = if foo
1
end
With regard to how to name boolean variables, the only
definitive advice to follow is if you define a method which
returns a boolean, write it with a method?
question
mark.
For instance variables, prefer trying be be clear. You might try
to use can_
, has_
, etc to improve
readability, but sometimes you ought not to. For example
is_
doesn’t add much more usually. If used, Sorbet
is there to help with hover-over type information and if you are
(definitely should be) using Sorbet to the fullest this
information will be readily in reach.
You’ll see two ways constants are written:
PascalCase = 1_000
SCREAMING_SNAKE_CASE = 1_000_000
I was in the Pascal case camp for quite awhile, coming from a JavaScript background I found no pleasure in screaming in all my code. However, we do need to standardize and the reason I now lean towards screaming is because Pascal case always has a reserved purpose: classes and modules. As such it’s better to scream constants always so it’s more clear.
For example, a method like:
def self.do_thing(key) # key is a String
# ...
other_thing(key.to_sym)
# ...
end
Instead of taking a String and then converting to a Symbol, just
take a Symbol. You might say, “Ah but I do that so the caller
has an easier time.” But now consider someone had a Symbol,
which is exactly what is wanted by other_thing
,
then they’ll have to do do_thing(symbol.to_s)
and
pay the cost of a double conversion.
In a similar vein, don’t take args which you’ll immediately put into a data structure or do parsing/validation. This allows the caller (which has the broader context) to appropriately construct the inputs and handle any potential errors.
Of course, a nice benefit of not embedding these type conversion is that the code will look much clearer. Someone won’t have read through a bunch of code wondering “why was it done with way?”
begin..rescue
blocks
We’ve all seen code like this:
begin
a = do_this(input.method)
b = do_that(a)
# ...dozens, hundreds of lines of code...
also_this(b)
rescue => error # or `catch` in other languages
error_handling
end
This is just plain lazy, the author didn’t bother to understand what might fail, or if they did they haven’t bothered to share with anyone else. Refactoring and maintaining this code will be a nightmare because we’ve smattered so many potential errors together.
Instead, keep your begin..rescue
cases as isolated
to individual failures as possible. So taking the above and
writing it properly:
value = input.method
a = begin
do_this(value)
rescue => do_this_error
do_this_error_handling
end
b = do_that(a)
# ...dozens, hundreds of lines of code... (but preferably not)
begin
also_this(b)
rescue => also_this_error
also_this_error_handling
end
This is much better!
The next reader of the code now does not have to investigate
input.method
nor do_that
for potential
errors, since working with them in isolation we know they don’t
raise exceptions in any way we intend to handle.
Error handling has better code locality. We’re not jumping back and forth in a huge method trying to see how each method call may map to the result errors below.
It’s now easier to split out and test individual pieces, and start constructing result types to manage these exceptions instead.
dangerous
for things related to security
which may need more scrutiny.
deplorable
for things that should not be done
because they are a bad coding pattern.
I would like to use these keywords to track various code migration burn downs. For example, to get free observability into a method’s decline, team’s simply would need to do a rename and then get all the dashboards for free.
I pitched this at Stripe as project "CodeTrend" but it never moved beyond a proposal.
You can use a third party Ruby gem from anywhere, just require it:
require 'mail'
Except don’t do that! You see Ruby gems shouldn’t really be allowed to be used that way, for a couple reasons:
Most Ruby gems aren’t typed. Sorbet is open
source but most gems are not using it. We mitigate this
slightly using
Ruby Interface (RBI) files
which give us known methods we can call on Ruby gems we use.
The problem with RBIs is they are terrible i.e. mostly
T.untyped
which makes sense because we lack the
type information.
Ruby gems ought not be available everywhere. In Stripe's mono-repo, we’ve got some work to do for packaging up cross-cutting dependencies and we really ought to be tracking Gem dependencies per package instead of being able to loosely load them up from anywhere. Loading up a whole bunch of gems that won’t be used contribute to slow Ruby code.
Ruby gems need maintenance. Suppose 500 places in the code we use a gem and we need to upgrade that gem from 0.2 to 0.6, how bad might that change be? This is really tough to answer because we don’t know how everyone expected the gem to work, nor can we assume everyone’s well tested their use of the gem. (Some gems are huge and we will often only need one or two method from them.) Sorbet typing can’t save us here because some files lack types and behavior changes would go undetected most likely.
So instead, do not use gems directly. Create (well-typed Sorbet) wrappers around the gem in its own module or package, tests those functions for the behavior you expect them to have, and use those interfaces instead of the gem directly.
My team wraps all third-party code in well-typed interfaces, allowing us to completely refactor and replace internals. For example, when upgrading Apple push notifications for the entire company, securing a template language, or moving to a new email provider.
Class#name
In general, class.name
is a foot-gun. Leaning on it
is surprising as people should feel comfortable renaming Ruby
classes without consequence. Using it can break logs, metrics,
code deployed across multiple services, etc which is why stable
IDs are an established pattern in most of my team's frameworks.
Stable IDs are the more durable ID for a piece of code or concept, you can see this pattern in most Ruby DSL’s like in:
class MyThing < AbstractFrameworkThing
config(
stable_id: :my_thing,
# ... other config
)
end
# or
class MyStuff < AbstractFrameworkThing
stable_id :my_thing
end
With these, the Ruby class name can change freely.
"Helper" modules tend to become junk drawers of unrelated code so I tend to only use helper modules in tests as a last resort and if the helper is already used across multiple test files.
If a set of methods can be grouped together, they should have a more descriptive name. Often times the best grouping is around a “policy” or “concern” module.
Similar to helper modules, instead of defining unrelated constants in one big file keep constants relevant to a piece of functionality alongside those files (preferably in the same module).