
Interface Design: Security

Thinking and examples of security first API design.

Stripe's number one priority is always security. So over the years I've gone from not worrying much at all to always thinking security first. This has brought a lot of nuance to how I design APIs, and I think it's worth sharing and reflecting on exactly how that looks.

Progressions

To effectively think security first, you have to start with threat modeling, figuring out what you actually care to secure today and what you will want to secure next.

Zero security is eventually bad, too much security and nothing ever happens. Great security is about carving out the right postures for the business, accepting risk where appropriate, and over time drawing stronger lines on what is table stakes for keeping the business operating and growing safely.

Security-first interfaces fit within this framing. You will have many possible interfaces, and the hard choice of which one to pick gets much easier when it is informed by a threat model.

The only counterweight to designing strictly by your threat model is designing for change. Sometimes even if the posture is not strong enough, it will be worth it to over-optimize just to save time later. Because it is a progression: if everything goes well, security will get stronger over time. So as a provider of an interface, designing for change and a bit into the future to save your future self time is totally cool. These incremental over-wins really do get balls rolling and become self-fulfilling prophecies in the right company.

Example: email headers

Email headers are like HTTP headers: key-value pairs that are included in each email. Some common headers include Subject, To, and From. When I joined Stripe, the API for specifying email headers was:

send_email(
  from: String,
  to: Array[String],
  subject: String,
  html_body: String,
  custom_headers: Map[String, String]
)

This is all right; it did the job for many years. In terms of being abusable, we were doing pretty well. For example, if our interface were instead:

send_email(
  message: String
)

And we put the onus on all callers to construct email headers and compile the full message themselves, we could not prevent content injection and escaping issues caused by malicious user input. With the Map[String, String] interface we can do that proper escaping of headers for our callers. Luckily, most email-building libraries are oriented around this style of building up messages, so we got this protection for free: the interface wasn't actively informed by a threat model.
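To make the injection concern concrete, here is a minimal sketch of what owning header rendering buys us. The names render_headers and HeaderInjectionError are hypothetical and not from any real library, and rejecting line breaks outright is a simplifying assumption; real email libraries typically encode or fold values instead.

```ruby
# Hypothetical error type for this sketch.
class HeaderInjectionError < StandardError; end

# Renders a name => value map into header lines, refusing anything that
# could smuggle an extra header into the message.
def render_headers(headers)
  headers.map do |name, value|
    # RFC 5322 field names are printable ASCII, excluding the colon.
    unless name.match?(/\A[!-9;-~]+\z/)
      raise HeaderInjectionError, "invalid header name: #{name.inspect}"
    end
    # A CR or LF in a value would let a caller start a new header line.
    if value.match?(/[\r\n]/)
      raise HeaderInjectionError, "line break in value for #{name}"
    end
    "#{name}: #{value}"
  end
end
```

With send_email(message: String) there is no single place to put a check like this; every caller would have to remember it.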

Interestingly, you'll find weird ambiguity with the Map[String, String] interface. For example, what might render out when we do:

send_email(
  ...,
  subject: 'Hello',
  custom_headers: {
    'subject' => 'foo',
    'SUBJECT' => 'bar',
    'Subject' => 'baz'
  }
)

Header names are case-insensitive, so how are we going to resolve this? What takes precedence? Surely the dedicated subject: param should always be used, right? Like-minded engineers may disagree on the role custom_headers should play.
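One way to at least surface the ambiguity is to group the map's keys case-insensitively and see which ones collide. This is only a sketch; colliding_header_names is a hypothetical helper, and it detects the problem without deciding which value should win.

```ruby
# Groups header names that differ only by case and returns the groups
# that contain more than one spelling.
def colliding_header_names(custom_headers)
  custom_headers.keys
                .group_by { |name| name.downcase }
                .select { |_normalized, names| names.size > 1 }
end
```

For the custom_headers map above, all three spellings land in a single group keyed by "subject", which is exactly the question the interface leaves unanswered.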

We can apply a threat model now:

  • What do we care about?
    • No email should arrive completely broken to end-users. If we weren't applying proper escaping to email headers, we could not say this. This is important to us as we need to maintain trust with our users. They trust us with much more important things than email, so we have to secure and be perceived to be securing email reasonably as well.
    • No email should serve a purpose beyond the company's intent. Within the company we trust each other to deliver an email but don't have that same trust in bad actors. Email headers can control presentation in ways we don't intend and can be abused by malicious users.
  • Brainstorm: What are some example situations?
    • Malicious input breaks our escaping and emails render out broken.
    • Malicious, but valid, custom headers do some weird thing for some email client.
  • How do we want to respond when that happens?
    • No response; make it impossible from the outset
    • Find and disable the vulnerable email
    • Find the vulnerability and make a code change to fix it
    • Do nothing, it's not worth our time

Where we drew the line in this case is, "Any email header we send must be intended by an engineer." The misuse of any value of an intended header we can mitigate by turning off an email or making a code change, but given how varied email clients are in interpreting a multitude of headers we can't know all the headers are safe to pass along.

Cool! Now designing the interface is easy: Map[String, String] is too powerful. An engineer could be easily forgiven for passing an arbitrary map of random, unintended headers if they were given that interface. Instead we need a fixed set of known headers for engineers to use and add to over time. This interface looks like this:

send_email(
  ...,
  custom_headers: [
    EmailHeader.in_reply_to(
      message_id: '<foo…>',
    ),
    EmailHeader.reply_to(
      address: 'me@example.com',
    ),
  ]
)

And will yield these headers in the final email:

Reply-To: me@example.com
In-Reply-To: <foo…>

Now if someone wants to use a new header, they need to add a new header method to EmailHeader. At a glance we can go into the code and see every custom email header we could possibly be sending by looking at EmailHeader. If you have a static type system, you can find each header's usages in the code as well. We have our fixed set of email headers.
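As a rough sketch of how such a class could be shaped (the internals here are my assumption for illustration, not the actual implementation), the key move is keeping the constructor private so the factory methods are the only way to mint a header:

```ruby
class EmailHeader
  attr_reader :name, :value

  # Each supported header gets its own factory method. Adding a new
  # header means adding a method here, where it can be reviewed.
  def self.in_reply_to(message_id:)
    new('In-Reply-To', message_id)
  end

  def self.reply_to(address:)
    new('Reply-To', address)
  end

  # Callers cannot build arbitrary headers with EmailHeader.new.
  private_class_method :new

  def initialize(name, value)
    @name = name
    @value = value
  end

  def to_s
    "#{name}: #{value}"
  end
end
```

Calling EmailHeader.new('X-Anything', '...') from outside the class raises NoMethodError, so the list of class methods really is the full set of headers we can send.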

This interface also solves our case-insensitivity problem because header names are no longer a part of the interface. We do still struggle with ambiguity; what renders when we do:

custom_headers: [
  EmailHeader.in_reply_to(message_id: '<foo…>'),
  EmailHeader.in_reply_to(message_id: '<bar…>'),
]

So that still needs to be worked through with a well-defined behavior or runtime check if the type system isn't powerful enough to describe unique sets.
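A runtime check for this can be quite small. The sketch below assumes each header exposes a name; Header is a hypothetical stand-in for the EmailHeader values, and raising is just one possible well-defined behavior (last-one-wins would be another).

```ruby
# Hypothetical stand-in for a typed header value.
Header = Struct.new(:name, :value)

class DuplicateHeaderError < ArgumentError; end

# Raises if two headers share a name (compared case-insensitively),
# otherwise returns the list unchanged.
def ensure_unique_headers!(headers)
  duplicated = headers.map { |header| header.name.downcase }
                      .tally
                      .select { |_name, count| count > 1 }
                      .keys
  unless duplicated.empty?
    raise DuplicateHeaderError, "duplicate headers: #{duplicated.join(', ')}"
  end
  headers
end
```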

The other cool thing about this interface is how bespoke we can get in terms of specifying header names and values. An engineer might design a single EmailHeader method that sets multiple headers and takes well defined values. It can also be well typed and tested in one place instead of having email implementation details bleed all over as String manipulation.
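For instance, replying within a thread conventionally involves both In-Reply-To and References. A single method could own that detail. This is a hypothetical sketch: reply_in_thread and its parameters are invented for illustration, and it returns plain name and value pairs to stay self-contained.

```ruby
class EmailHeader
  # Returns the [name, value] pairs a threaded reply needs. References
  # lists the earlier messages in the thread followed by the parent.
  def self.reply_in_thread(parent_message_id:, ancestor_message_ids: [])
    [
      ['In-Reply-To', parent_message_id],
      ['References', (ancestor_message_ids + [parent_message_id]).join(' ')]
    ]
  end
end
```

Callers express "this is a reply in that thread" and never touch the header names or the space-separated format.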

I think email headers are a particularly interesting example to demonstrate progression:

  • They are conceptually pretty simple and have a bunch of concerns in the details.
  • No interface is right or wrong unless you apply a threat model.
  • Implementing EmailHeader over a simple Map[String, String] is a pretty low cost investment, but can have so many advantages if you want them.

Example: email addresses

Wait, addresses are values in email headers, right? We just had a whole section on those. This is true, but headers for email addresses are special.

Custom headers at their worst can, through exploits in escaping or overriding of headers, cause emails to send to someone else by fudging the address headers up. But after preventing that, they can really only break presentation.

Address headers need special attention because of authorization, reputation, and data exfiltration. To threat model again:

  • What do we care about?

    • Emails should only be sent to the intended individuals or trusted parties. We need to send the email to the correct audience. Tangibly, sending the email to the wrong place can get us marked as spam and make it harder to reach our users in the general case.

    • We need to keep user data entrusted to us secure. Emails can contain user data so sending it to the wrong place or to someone who shouldn't see it is really bad. Once an email is sent, we cannot claw it back; the damage has been done.

  • Brainstorm: What are some example situations?

    • User Bob gets an email intended for user Mary.
    • Bob receives an email, but so does someone who shouldn't on CC or BCC.
    • User data in the email copy, in attachments, or unauthenticated links is accessible to bad actors.

This is really serious! It's important to call out that sometimes the best interface is no interface. For example, if someone wanted to send a confidential document as an email attachment: heck no! We would instead send an email with a link to authenticate and then download the document. We don't trust email as a medium for all communications to users.

In terms of an actual interface however, we have designed to make addressing safer.

Richer addressing

The easiest way to deal with addresses at the interface is to not deal with addresses. Instead of:

send_email(
  to: Array[String],
  cc: Array[String],
  bcc: Array[String],
  ...
)

We can meet developers where they are and do the heavy lifting for them:

send_user_email(
  user: User,
  ...
)

And figure out all the proper addressing under the interface. This is really powerful in many ways. We've captured a much higher level intent of "send to this user" as opposed to "send to this address."

We provide more than this interface so people can do their jobs: we don't always have a user. So we generally have two APIs: the low-level one to more closely match the wire protocol and the one 99% of developers use that is tuned to business domain concerns.

The richer we can specialize for the User scenario, the more context we have to evolve how we communicate over time. No one really cared to have to type send_email(to: [user.email]) over and over anyways.

A great benefit of having the richer interfaces is observability. A user's email address can change over time but their ID is unique, durable, and safe to log. Email addresses are personally identifiable information (PII), which we don't want to log all over. There's also familiarity: everyone is already working with user IDs in their business logic.
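Here is a small sketch of that idea. Everything in it is assumed for illustration: the User struct, the UserMailer class, and the in-memory outbox and log standing in for real delivery and logging.

```ruby
# Hypothetical user record; only id and email matter for this sketch.
User = Struct.new(:id, :email, keyword_init: true)

class UserMailer
  attr_reader :outbox, :log

  def initialize
    @outbox = [] # stands in for the low-level send_email call
    @log = []    # stands in for a real logger
  end

  # Callers say who should get the email; addressing is resolved here.
  def send_user_email(user:, subject:, html_body:)
    @outbox << { to: [user.email], subject: subject, html_body: html_body }
    # Log the durable user ID, never the address itself.
    @log << "email_sent user_id=#{user.id}"
    nil
  end
end
```

Because the address is only read inside send_user_email, nothing upstream has a reason to handle or log it.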

Limited addressing

To avoid problems with users being overshared data they should not see, we opted to scrutinize having multiple To addresses, and whether we even needed CC and BCC addresses at all in our higher level interfaces. As it turned out for us, yeah they are not really needed.

We do have use cases for CC addresses, but given how uncommon they are we've been able to manage them behind a private interface. Tied to using this interface are well-defined criteria for when it is okay to use.

Limiting emails to "one per recipient" makes modeling much easier and allows us to ensure each individual send is getting the same authorization checks. It also helps answer some product quality questions such as, "If Bob wants his email in Spanish and Mary in English, what do they get?" Each recipient gets their own email in the language they want it.
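A sketch of what "one per recipient" can look like in code follows. The Recipient struct, the SUBJECTS table, and the authorized: callable are all assumptions made up for this example.

```ruby
# Hypothetical recipient record with a preferred locale.
Recipient = Struct.new(:id, :email, :locale, keyword_init: true)

# Hypothetical localized copy, falling back to English.
SUBJECTS = { 'en' => 'Your receipt', 'es' => 'Tu recibo' }.freeze

# Builds one email per recipient. Every send passes through the same
# authorization check and gets copy in that recipient's language.
def build_emails(recipients, authorized:)
  recipients.map do |recipient|
    unless authorized.call(recipient)
      raise ArgumentError, "recipient #{recipient.id} is not authorized"
    end
    {
      to: [recipient.email],
      subject: SUBJECTS.fetch(recipient.locale, SUBJECTS['en'])
    }
  end
end
```

Bob with locale 'es' and Mary with locale 'en' come out as two separate emails, each with a single To address, so there is no shared message whose language or audience needs negotiating.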

So that's addressing. It's another pretty cool example of progression:

  • Oftentimes platform and infra teams can think their only job is to provide a library matching the wire protocol and they're done. In this case we were able to cut out a lot of that complexity by providing interfaces that better matched developer intent. But it happened gradually: we offered everything and then trimmed away what wasn't necessary by applying our threat model.

  • Sometimes the risk is too high and no interface is the very best interface. As much as we iterate on interfaces there is a limit to what we can encourage and prevent by having one at all. We used to support BCC, but now we don't at all.


Ugh, there's so much more to talk about. But this is enough for now. If you're interested in more API design, check out my other article Interface Design: Developer Experience and let me know if you'd like to read more on Twitter or shoot me an email (with headers).

Go create a threat model and make an interface that satisfies it (maybe better than needed)!

Thanks for reading!