Runtime type checking

Table of contents

Expression vs confidence
Breaking changes
Checking performance
Types as data
Conclusion

Jake asked my thoughts on Sorbet's runtime type checking. At the same time I've been revisiting Clojure's runtime type checking alternative Spec. So I think this is a good time to flesh out some thoughts.

Expression vs confidence

I primarily code in TypeScript and Sorbet Ruby, for both work and personal projects. TypeScript's type system is quite powerful: it's rich, has tons of features and nice syntactic sugar, and meshes well with the stated goal of being a bolt-on type system for JavaScript. We don't escape any of the pain of JavaScript; it's just made more bearable by the compile-time type-checking and IDE experience. And that's where TypeScript's help stops.

Sorbet has runtime type checking, so it goes further than TypeScript: all the way with us into production (by default). These runtime checks are awesome because they don't lie. TypeScript can be a puzzle game: 'how do I express this invariant in a way that the compiler will enforce it' is a fun exercise, but someone will write x as any when the time comes, and then the puzzle is useless for what actually matters: production code. Sorbet is not just Ruby, though: the runtime type-checking does add overhead. Sorbet is a less feature-ful type system, but ~everything it does provide is enforced both statically and at runtime.

TypeScript and Sorbet are juxtaposed on two axes:

Expressiveness of type system: getting to say the most things
Reach of the type system: how much of what is said is true

Sorbet can't achieve a more powerful type system because the runtime type-checking needs to be grounded in reality (types must be class/module/value-based to make checking practical). TypeScript can't provide runtime checks because it is so expressive that it cannot translate to a viable runtime checking analog (structural types are prohibitively expensive to check at runtime).
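To make the cost difference concrete, here's a minimal plain-Ruby sketch. The helpers nominal_match? and structural_match?, along with GridPoint, POINT_SHAPE and LINE_SHAPE, are names I made up for illustration; none of this is Sorbet's or TypeScript's implementation. It only shows why a class-based check is one call while a shape-based check has to walk the whole value.

```ruby
# Nominal check: one class-membership test, no matter how large the value is.
def nominal_match?(value, klass)
  value.is_a?(klass)
end

# Structural check (hypothetical): every required key must be present and
# every nested shape must be walked, so the cost grows with the shape's size.
def structural_match?(value, shape)
  return value.is_a?(shape) if shape.is_a?(Module)
  return false unless value.is_a?(Hash)

  shape.all? do |key, nested_shape|
    value.key?(key) && structural_match?(value[key], nested_shape)
  end
end

GridPoint = Struct.new(:x, :y)

POINT_SHAPE = { x: Integer, y: Integer }.freeze
LINE_SHAPE = { from: POINT_SHAPE, to: POINT_SHAPE }.freeze

# A single is_a? call.
nominal_match?(GridPoint.new(1, 2), GridPoint)

# Visits :from, :to, and all four coordinates before it can answer.
structural_match?({ from: { x: 0, y: 0 }, to: { x: 3, y: 4 } }, LINE_SHAPE)
```

The structural version isn't hard to write, but it runs on every call, and its cost depends on how deep and wide the shape is rather than being constant.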

As much as I like TypeScript, Sorbet has the better trade-offs to me:

When I read Sorbet code, I know it says something about production behavior. With TypeScript code, I know it says basically nothing.
TypeScript is more expressive, but when you can't confirm it in production it has less meaning. Structured code comments can lead to messes of complexity that don't provide value. With Sorbet, things have to be kept simple and within Sorbet's limits, or you must annotate where you've exceeded them (e.g. T.untyped), which makes it clear you're on your own.

Breaking changes

Adding runtime checks can be a breaking change, so existing untyped code needs to be updated carefully. Sorbet provides on_failure for that change management, allowing an evolution:

# Starting point: no signature, so nothing is checked.
def get_http_error_from_status_code(status_code); end

# We log when the validation fails and monitor the logs to gain
# production confidence the type annotations are correct.
sig do
  params(status_code: Integer)
    .returns(T.nilable(HttpError))
    .on_failure(:log)
end
def get_http_error_from_status_code(status_code); end

# Confident, we can remove the logging
sig {params(status_code: Integer).returns(T.nilable(HttpError))}
def get_http_error_from_status_code(status_code); end
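
For anyone who wants to see the 'log first, raise later' idea without pulling in sorbet-runtime, here's a rough plain-Ruby sketch. RolloutCheck, its check_arg method, its LOG array, and StatusLookup are all hypothetical stand-ins I invented for this post. This is not how Sorbet implements on_failure, just the shape of the rollout.

```ruby
# Hypothetical stand-in for a runtime-checked signature; not Sorbet's API.
module RolloutCheck
  # Collected messages stand in for a real logger.
  LOG = []

  # Wraps `method_name` on `klass` so its single argument is checked against
  # `arg_class`. `on_failure` is :log (record and continue) or :raise.
  def self.check_arg(klass, method_name, arg_class, on_failure:)
    original = klass.instance_method(method_name)
    klass.send(:define_method, method_name) do |arg|
      unless arg.is_a?(arg_class)
        message = "#{method_name}: expected #{arg_class}, got #{arg.class}"
        raise TypeError, message if on_failure == :raise

        LOG << message
      end
      original.bind(self).call(arg)
    end
  end
end

class StatusLookup
  def error_name(status_code)
    status_code.to_i >= 400 ? "error" : nil
  end
end

# Stage 1: log mismatches but keep serving traffic.
RolloutCheck.check_arg(StatusLookup, :error_name, Integer, on_failure: :log)

# A caller passing a String still gets an answer; the mismatch lands in LOG.
StatusLookup.new.error_name("404")
```

Once the log stays quiet for long enough, switching the same call to on_failure: :raise is the plain-Ruby equivalent of dropping .on_failure(:log) from the sig.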

Contrast this with TypeScript, where there is no change management story for breaking changes because the type system plays no role in production. TypeScript is free to keep making breaking changes to the type system, and end developers write whatever contracts they want in a void.

You must either accept that your types are going to mean very little or shoulder a relatively heavy-weight change management process to 'earn' that value.

While you might consider this a ding against Sorbet, think about it this way: once you're heavily invested in your type system and all new code has types, writing new code carries no extra change management overhead, because ideally you get the types right the first time.

For Sorbet users, this means the types are 100% trustworthy wherever they are used. For TypeScript users, it means you're always one moment away from someone sneaking around the type system. You're embracing the type system in both cases, but only with Sorbet does confidence compound on your investment.

Checking performance

I just said a lot of good things about Sorbet. Unlike TypeScript, it's not just glorified code comments; it does give us production confidence.

However, there are a bunch of places where Sorbet does not give us confidence in production, and most of them are due to performance.

Notably, lots of code that executes in the critical path has runtime checks disabled in the code-bases I work in. Sorbet allows this with:

sig {params(x: Integer).returns(Integer).checked(:tests)}
def method_underpinning_everything(x); end

In this case, checks will only run in tests. But sometimes it goes further, to no checks at all, and sometimes even sig itself is prohibitively expensive.

Sorbet also gives up on checking generic types in some places. Take these examples:

# Fails statically but no issue at runtime.
T.let([1, 2, 'string'], T::Array[Integer])

# This one doesn't even fail statically.
# (As of this writing at least)
T.let({a: 1, b: 'string'}, T::Hash[Symbol, Integer])

An array or hash with an unbounded number of elements, keys, or values is too expensive to check and lacks predictable performance. Sorbet is lying to us in these cases: what we really have at hand in production is T::Array[T.untyped] and T::Hash[T.untyped, T.untyped], and very few people internalize this.
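
Here's a small plain-Ruby sketch of the trade-off. shallow_array_of? and deep_array_of? are hypothetical helpers I wrote for illustration, not Sorbet internals. The shallow one only looks at the container, the deep one looks at every element.

```ruby
# Shallow check: constant time, only confirms the container's class.
def shallow_array_of?(value, _element_class)
  value.is_a?(Array)
end

# Deep check: visits every element, so the cost grows with the array's length.
def deep_array_of?(value, element_class)
  value.is_a?(Array) && value.all? { |element| element.is_a?(element_class) }
end

mixed = [1, 2, "string"]

shallow_array_of?(mixed, Integer) # => true, the String goes unnoticed
deep_array_of?(mixed, Integer)    # => false, the String is found
```

The deep version is the one that matches what the annotation claims, but on a hot path with large collections it's exactly the unpredictable cost described above.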

By trading confidence for performance, Sorbet erodes that trust I liked so much. In the checked case it's not as bad because it is explicit, but the generic case is something that has burned me and others. But hey, it's still doing better than TypeScript, which doesn't compete at all.

It was these compromises in type safety that got me really excited about the Sorbet compiler when it was announced. The performance gains of the compilation over interpretation overcome the added cost of type-checking to make the performance vs. confidence gap much narrower.

Types as data

Sorbet's runtime checks require that types be represented in some way at runtime. The checks performed are usually is_a?()-like checks, but there are other data-like constructs too. T::Boolean, for example, is a union type of TrueClass and FalseClass, and it is represented at runtime quite transparently:

> T::Boolean
#=> #<T::Private::Types::TypeAlias:0x0123
 @callable=#<Proc:0x0456>
 @aliased_type=#<T::Private::Types::SimplePairUnion:0x0789
   @raw_a=true,
   @raw_b=false
 >
>

One thing that's annoying about TypeScript is that after you've gotten your types working, it's kind of a reuse dead end. You have to reach for spooky additional tooling like io-ts and throw out the types you just wrote, or hand roll your checks yet again.

Sorbet types, because they have a runtime presence, are not a dead-end. We can use them to parse untrusted inputs:

class Person < T::Struct
  const :first_name, String
  const :last_name, String
end

sig do
  params(hash: T::Hash[T.untyped, T.untyped])
    .returns(T.nilable(Person))
end
def try_to_parse_person(hash)
  Person.from_hash(hash)
rescue TypeError
  nil
end

Definitely another win for runtime checks. We want types to be tools in our day-to-day work as well, not just code comments.
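
To show what 'types as data' buys you without depending on sorbet-runtime, here's a minimal plain-Ruby sketch. PERSON_SCHEMA, ParsedPerson, and parse_with_schema are hypothetical names for this example, and the assumption that the untrusted input uses String keys (as decoded JSON would) is mine. It is not how T::Struct works internally.

```ruby
# Hypothetical schema: field name => expected class. Because it is plain data,
# it can be inspected, iterated over, and reused at runtime.
PERSON_SCHEMA = { first_name: String, last_name: String }.freeze

ParsedPerson = Struct.new(:first_name, :last_name)

# Returns a populated struct, or nil if the input doesn't satisfy the schema.
# Assumes untrusted input arrives with String keys, e.g. from decoded JSON.
def parse_with_schema(input, schema, struct_class)
  return nil unless input.is_a?(Hash)

  values = schema.map do |field, expected_class|
    value = input[field.to_s]
    # A missing or mistyped field rejects the whole input.
    return nil unless value.is_a?(expected_class)

    value
  end
  struct_class.new(*values)
end

# Every field present with the right class, so this returns a ParsedPerson.
parse_with_schema(
  { "first_name" => "Ada", "last_name" => "Lovelace" },
  PERSON_SCHEMA,
  ParsedPerson
)

# last_name is an Integer, so this returns nil.
parse_with_schema(
  { "first_name" => "Ada", "last_name" => 42 },
  PERSON_SCHEMA,
  ParsedPerson
)
```

The same schema object that documents the shape also drives the parsing, which is the reuse that erased compile-time-only types can't offer on their own.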

Conclusion

Runtime type-checking is a more situated problem than compile-time-only checking: you have two stories to manage, static and runtime, instead of one. Because of performance and how code executes, a type system that offers runtime checks will be more constrained in what it can offer, and in what it can offer with confidence.

TypeScript and Sorbet are defined by their host languages. Neither had a clean slate; both met their ecosystems where they were. Given a clean slate, I aspire to the outcomes Sorbet achieves more than to TypeScript's.

Facilitating runtime checks is what is so great about Sorbet. While it is limited in expressiveness, Sorbet sets engineers on the right track: solving problems, not puzzles.

Published 12/19/2022