A long time ago Ken Thompson wrote something called Reflections on Trusting Trust. If you’ve never read this, go read it right now. It’s short and it’s something everyone needs to understand. The paper basically explains how Ken backdoored the compiler on a UNIX system in such a way that it was extremely hard to get rid of the backdoors (yes, more than one). His conclusion was that you can only trust code you wrote yourself. Given the nature of the world today, that’s no longer an option.
Every now and then I have someone ask me about Debian’s Reproducible Builds. There are other groups working on similar things, but these guys seem to be the furthest along. I want to make clear right away that this work being done is really cool and super important, but not exactly for the reasons people assume. The Debian page is good about explaining what’s going on but I think it’s easy to jump to some false conclusions on this one.
Firstly, the point of a reproducible build is to allow two different systems to build the exact same binary from the same source. If the results match, that tells us the resulting binary was not tampered with. It does not tell us the compiler is trustworthy or that the thing we built is trustworthy. It only tells us that the system used to build it was clean and the binary wasn’t meddled with before it got to you.
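To make that concrete, here is a minimal sketch of what the comparison boils down to: hash the artifact from each builder and check that the digests are identical. The function names and the idea of passing in two file paths are my own assumptions for illustration. This is not how Debian’s tooling works, just the underlying idea.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(artifact_a: Path, artifact_b: Path) -> bool:
    """True if two independently built artifacts are bit-for-bit identical.

    Hypothetical helper: a match says nothing about whether the source
    or the compiler is trustworthy, only that both builders produced
    the same bytes.
    """
    return sha256_of(artifact_a) == sha256_of(artifact_b)
```

Notice what this does not look at: the source code, the compiler, or anything that happened before the build started. If both builders start from the same backdoored source, the hashes will still match.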
A lot of people assume a reproducible build means there can’t be a backdoor in the binary. There can, because of how the supply chain works. Let’s break this down into a few stages. In the universe of software creation and distribution there are literally thousands to millions of steps happening, from each commit, to releases, to builds, to consumption. It’s pretty wild. We’ll keep it high level.
Here are the places I will talk about. Each one of these could be a book, but I’ll keep it short on purpose.
- Development: Creation of the code in question
- Release: Sending the code out into the world
- Build: Turning the code into a binary
- Compose: Including the binary in some larger project
- Consumption: Using the binary to do something useful
The point of this article isn’t to try to scare anyone (even though it is pretty scary if you really think about it). The real point is to stress that nobody can do this alone. There was once a time when a single group could plausibly try to own their entire development stack, but those times are long gone. What you need to do is look at the above steps and decide where you want to draw your line. Do you have a supplier you can trust all the way to consumption? Do you only trust them for development and release? If you can’t draw that line, you shouldn’t be using that supplier. In most cases you have to draw the line at compose. If you don’t trust what your supplier does beneath that stage, you need a new supplier. Demanding they give you reproducible builds isn’t going to help you, because they could still backdoor things during development or release. It’s the old saying: Trust, but verify.
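If it helps to see the stages as data, here is a rough sketch. The `Stage` names come from the list above. The function name and the rule that a reproducible build only checks the build stage are my simplifications for illustration, not any real tool’s model.

```python
from enum import IntEnum
from typing import List


class Stage(IntEnum):
    DEVELOPMENT = 1
    RELEASE = 2
    BUILD = 3
    COMPOSE = 4
    CONSUMPTION = 5


# Assumption for this sketch: a reproducible build verifies only this stage.
VERIFIED_BY_REPRODUCIBLE_BUILD = {Stage.BUILD}


def taken_on_trust(last_supplier_stage: Stage) -> List[Stage]:
    """Stages the supplier handles that a reproducible build cannot check."""
    return [
        stage
        for stage in Stage
        if stage <= last_supplier_stage
        and stage not in VERIFIED_BY_REPRODUCIBLE_BUILD
    ]
```

For a supplier you rely on through the build stage, `taken_on_trust(Stage.BUILD)` gives back development and release, which are exactly the places a reproducible build can’t see into.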
Let me know what you think. I’m @joshbressers on Twitter.