Language golden paths

Golden paths that programming language designers should incorporate into their core principles

In my experience, great software tends to converge at a few common points:

  • Simple constructs are provided that can be used in any context (building blocks)
  • Composition of those building blocks is very intentional (explicit opt-in)
  • The least amount of power is used to address problems (bias for data)

These paths manifest in concrete language choices or, in my case, in the restricted ways I use fuller languages. These are their stories.

Private by default

Arguably the most important golden path: keep the surface area, or interface, of any piece of software small. This is also known as encapsulation, but it doesn't need to be tied to object-oriented programming.

JavaScript modules are a good example of this:

// Local to the module, not accessible outside
function x() {}

// Exported to be used by other modules
export function y() {}

It's not much more typing, but it's enough to communicate the explicit choice to make some things public and keep others private.

Compare this with a poor example from Ruby:

module X
  # Public: anyone can call this
  def self.x; end

  # Can only be called within the module
  private_class_method def self.y; end
end

Ruby is encouraging public-by-default by making the private version way more typing. When the more lax thing is easiest to do, it becomes unclear what the author's intent was: did they mean to expose the method to everyone or did they simply forget to close that door?

These differences may seem minor but don't underestimate the influence of defaults, especially with regard to language design. Modern Ruby is always a game of fighting interwoven messes while JavaScript ESM has brought much needed isolated sanity to JavaScript, which also used to be a wild west of over-exposed interfaces. The difference is night and day.
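The same private-by-default shape is available without modules or classes, which is the sense in which encapsulation isn't tied to object-oriented programming. A minimal sketch using a closure; `makeCounter` and its members are hypothetical names chosen for illustration:

```javascript
// Hypothetical example: `makeCounter` is an illustrative name.
function makeCounter() {
  // Private: only reachable from inside this closure.
  let count = 0
  function clamp(n) {
    return Math.max(0, n)
  }

  // Public: the returned object is the entire interface.
  return {
    increment() {
      count = clamp(count + 1)
      return count
    },
    decrement() {
      count = clamp(count - 1)
      return count
    },
  }
}

const counter = makeCounter()
counter.increment() // 1
counter.increment() // 2
console.log(Object.keys(counter)) // [ 'increment', 'decrement' ]
console.log(counter.count) // undefined, the state never leaked
```

Anything not placed on the returned object stays private without any extra keyword, so exposing something is always a deliberate act.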

Immutable by default

Simple programs have a clear delineation between data and behavior. Data is passed to pure functions which hold no internal state. Adhering to this principle leads to better code, but as a mere author in someone else's language, achieving pure data can be quite a challenge.

Both JavaScript and Ruby data structures are mostly mutable. JavaScript is a bit better because its strings are consistently immutable. Contrast these languages with Clojure, where everything is immutable by default and mutation is the exception:

// In JavaScript `const` only means
// the variable cannot be reassigned.
const a = { foo: 'bar' }
a.foo = 'baz'

# In Ruby you might reach for a class
# over the plain Hash structure here.
a = {foo: 'bar'}
a[:foo] = 'baz'

; And here's Clojure
(def a {:foo "bar"})
(assoc a :foo "baz")
; The var `a` still holds its original value,
; we've only made an altered copy.


; If we want to make the change mutably,
; we need to invoke additional APIs.
; Transients require the `!` variants such as `assoc!`.
(def a
  (-> {:foo "bar"}
      (transient)
      (assoc! :foo "baz")
      (persistent!)))
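For comparison, JavaScript does have an opt-in: `Object.freeze`. The sketch below shows roughly how it behaves. The freeze is shallow, and a failed write only throws in strict mode (it is silently ignored otherwise). The helper name `tryWrite` is hypothetical.

```javascript
// Hypothetical helper: attempts a write and reports what happened.
function tryWrite(obj) {
  'use strict' // in strict mode, writing to a frozen object throws
  try {
    obj.foo = 'baz'
    return 'written'
  } catch (err) {
    return err.constructor.name
  }
}

const a = Object.freeze({ foo: 'bar', nested: { n: 1 } })

console.log(tryWrite(a)) // TypeError
console.log(a.foo) // bar

// The freeze is shallow: nested objects stay mutable.
a.nested.n = 2
console.log(a.nested.n) // 2

// An altered copy, similar in spirit to Clojure's `assoc`.
const b = { ...a, foo: 'baz' }
console.log(b.foo) // baz
```

Immutability here is something the author has to remember to ask for, which is the opposite of Clojure's default.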

Even if a language doesn't go as far as Clojure, naming convention is another way to discourage mutability. For example, Ruby often uses the exclamation point ! (as Clojure does above) to denote methods that mutate their receiver, offering both flavors:

a = "Hello  "


# Makes a trimmed copy
b = a.strip


# Mutates `a` in place
a.strip!

Whereas in JavaScript you have to consult the documentation to see whether a given method mutates the object it's called on.
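A short illustration with built-in array methods, where nothing in the names hints at which ones mutate:

```javascript
const a = [3, 1, 2]

// Non-mutating: `slice` and `concat` return new arrays.
const firstTwo = a.slice(0, 2)
const extended = a.concat([4])

// Mutating: `sort` reorders `a` in place and returns
// the same array, not a copy.
const sorted = a.sort()

console.log(a) // [ 1, 2, 3 ]
console.log(sorted === a) // true
console.log(firstTwo) // [ 3, 1 ]
console.log(extended) // [ 3, 1, 2, 4 ]

// To get a sorted copy, copy first and then sort the copy.
const b = [3, 1, 2]
const sortedCopy = [...b].sort()
console.log(b) // [ 3, 1, 2 ]
console.log(sortedCopy) // [ 1, 2, 3 ]
```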

Variable declarations ought to encourage immutability as well:

// (JavaScript) Bad, the more immutable choice is longer.
let x = 0
const y = 1

// Better! The mutability is screaming in your face.
// (Hypothetical syntax, not valid JavaScript.)
let mutable x = 0
let y = 1

Modular by default

Another key component of good code is the ability to introspect it. In an integrated development environment, we're usually looking to achieve go-to-definition and find-all-references code traversal operations. The editing experience is night and day once we have these operations, so a programming language should lend itself to helping achieve them.

JavaScript modules enable this readily by using import statements:

import { foo } from './other-module'

function bar() {
  foo()
}

Humans and editors alike can find quite easily where foo is sourced from.

Ruby does not enable this on its own:

module Example
  def self.bar
    OtherModule.foo
  end
end

Ruby is such a dynamic system that it doesn't care how code is loaded. All that must be true is that OtherModule.foo has been loaded somewhere by the time Example.bar is called. Good luck with that one, IDE! And especially you, human.

Without a module loading system in the language, Ruby is left with a lot of options. But at scale there are really only two module loading options:

  • Explicit require "file-path" loading, or
  • Path-based autoloading (à la Zeitwerk)

Explicit requires are similar to JavaScript imports, but there is a wide chasm between "loading a file of Ruby code" and "having explicit access to imported code". A loaded file can define or reopen namespaces anywhere, so it's not an isolated unit that can be reasoned about as such. It's more akin to C/C++ includes, which is to say not good.

Path-based autoloading suffers from almost the opposite problem: overly aggressive coupling between a global hierarchy (the Ruby namespace) and implementations. Without additional global configuration, every class or module needs its own dedicated file. This is more akin to Java class files, which is to say not good.

Clojure's code loading is also not modular. Anyone can contribute new vars to an existing namespace, and namespaces are located by file path on the classpath. There is no module isolation. However, at least the imports of namespaces are explicit:

(ns my-namespace
  ; No quote is needed inside the ns form
  (:require [clojure.string :as str]))

; str/join takes a separator and a collection
(def foo
  (str/join " " ["Hello" "world"]))

Clojure makes this namespace problem a bit more manageable by using long namespaces such as com.example.project to make collisions less likely (a carry-over of Java packages).

While this discussion of explicit module scope overlaps with private by default, the core advantage argued here is better code navigation and traceability. Private by default is about hiding details; modular by default is about what shouldn't be hidden.

Another way you might think of this is: minimize global assumptions and global context. Avoid:

  • Modules which re-define globals for every other context, also known as monkey-patching for good reason.
  • Modules which add to globals. Adding to a collection is fine when there's only one adder, but allowing independent modules to add to globals leads to conflicts.
  • Modules which have side effects when loaded, not just those that reach across code boundaries. Modules shouldn't, e.g., make database calls or read from the filesystem until called upon.

Restrictions such as these set a firm baseline for modular code to be written.
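The last restriction can be sketched as follows. This is a minimal illustration under assumed names: `loadConfig` is a hypothetical stand-in for any effectful call, and `loadCount` exists only to make the effect observable.

```javascript
// Hypothetical stand-in for any effectful call
// (database, filesystem, network).
let loadCount = 0
function loadConfig() {
  loadCount += 1
  return { greeting: 'hello' }
}

// Avoid: this would run as soon as the module is loaded,
// whether or not anyone needs the result.
// const config = loadConfig()

// Prefer: nothing happens until a caller asks.
let cached = null
function getConfig() {
  if (cached === null) cached = loadConfig()
  return cached
}

const loadsBeforeFirstCall = loadCount
getConfig()
getConfig()
console.log(loadsBeforeFirstCall, loadCount) // 0 1
```

Loading this module is inert. The effect happens once, on first use, at a call site that a reader or an IDE can find.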

Simple by necessity

If there's something to know about working at the bottom of everyone's stuff, it's that you'll often find yourself saying, "This is why we can't have nice things." Give an inch of convenience, to others or to ourselves, and a mile will be taken. What you can have is only the simplest things, plus the means for others to add conveniences atop.

Nowhere is this truer than in how we treat data in our programs. Every convenience offered around data structures is eventually met with regret.

JavaScript is riddled with niceties:

  • The equality operator == performs implicit conversions.
  • All objects combine prototypal inheritance with associative arrays.
  • Function and var declarations are hoisted.

These things were nice to someone at the time, but they have aged poorly and have gone on to make thousands of people learn things they otherwise never needed to learn.
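For readers who haven't been bitten yet, each of the three JavaScript items above fits in a few lines (the function name `hoisting` is just an illustrative wrapper):

```javascript
// 1. `==` converts its operands before comparing.
console.log(0 == '') // true
console.log('0' == false) // true
console.log(0 === '') // false

// 2. A plain object used as a map inherits keys from its prototype.
const counts = {}
console.log('toString' in counts) // true
console.log('toString' in Object.create(null)) // false

// 3. `var` and function declarations are hoisted to the top
// of their enclosing function.
function hoisting() {
  const before = v // undefined rather than a ReferenceError
  var v = 1
  return [before, v, later()]

  function later() {
    return 'callable before its definition'
  }
}

const result = hoisting()
console.log(result[0], result[1]) // undefined 1
console.log(result[2]) // callable before its definition
```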

Clojure also falters on having many niceties:

  • Data structures being both function and data.
  • A vast swath of read macros and syntactic sugar.
  • Severely overloaded functions.

And no doubt Ruby has something for everyone in this department. My personal pick is how top-level methods are added to Object. Others might rail on other things.

We need to dissect everything and decide if it is simple and essential or if it is just a nice-to-have. Leave the nice-to-haves to someone else, for that someone to decide what is essential to them. But we don't drag everyone else along just for that problem space.

Interactive by default

Most of engineering is learning by doing. We must enable interactive development and have tight feedback loops to quickly test hypotheses and see our code working.

As systems scale, we struggle to manage the complexity of what we've created. The goal posts change from "get it working now" to "is it working still?" and "how does it work?"

What we need in both these cases is a dynamic system for interacting with our programs. We've lost this spirit out of learned fear, lack of discipline, and lack of tooling, but the future of programming lies in things like the REPL and more powerful IDEs.

I'll state my biases: JavaScript, Ruby, and Clojure all provide REPL experiences. I can't stand working in languages without them. Iteratively building up a working program has to be the optimal way to construct a program.

What's interesting about an introspectable system is that it requires a special means of interacting with programs that is seemingly incongruent with some of the above golden paths. A REPL is often littered with niceties, encourages jumping into private bits, conflicts with an explicit import system with regard to verbosity, and allows mutation of otherwise immutable programs.

In striving for decent REPL-hood, compromises are made. And once those compromises are made, where does it end? Why not have the whole language itself be the REPL it will support? For JavaScript, the REPL is a narrow window into the language. For Clojure and Ruby, there's really no distinction or mode.

How can we have our iterative development and hold to our principles?

  • Immutability: represent the entire program as an immutable data structure. There is no redefining a program, only creating an entirely new program with methods replaced.
  • Private by default & modular by default: retain the boundaries by making imports similarly dynamic and traverse into modules explicitly at the REPL. This would be akin to how in Clojure one can assume a namespace temporarily.
  • Simple by necessity: find the simplest way to support both the language and the REPL experience with the same primitives. (No pressure :))
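One way to picture the immutability idea above, as a toy sketch rather than a real language design: the "program" is a frozen map of named functions, and redefining one returns a new program while the old one keeps working. All names here (`makeProgram`, `redefine`, `run`) are hypothetical.

```javascript
// Hypothetical names throughout: `makeProgram`, `redefine`, `run`.
function makeProgram(definitions) {
  return Object.freeze({ ...definitions })
}

// "Redefining" builds a new program; the old one is untouched.
function redefine(program, name, fn) {
  return makeProgram({ ...program, [name]: fn })
}

// Definitions receive the program they run in, so calls between
// them resolve against that specific version.
function run(program, name, ...args) {
  return program[name](program, ...args)
}

const v1 = makeProgram({
  greet: (program, who) => `Hello, ${run(program, 'punctuate', who)}`,
  punctuate: (program, text) => `${text}.`,
})

const v2 = redefine(v1, 'punctuate', (program, text) => `${text}!`)

console.log(run(v1, 'greet', 'world')) // Hello, world.
console.log(run(v2, 'greet', 'world')) // Hello, world!
```

A REPL session under this model would be a sequence of program values, each one inspectable and none of them overwritten.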