Supplying the supply chain

A long time ago Marc Andreessen said “software is eating the world”. In hindsight the statement turned out to be quite profound, as most profound statements are. At the time nobody really understood what he meant, and it probably wasn’t until the public cloud caught on that it became something nobody could ignore. The future of technology was less about selling hardware than it was about building software.

We’re at a point now where it’s time to rethink software. Well, the rethinking happened quite some time ago; now everyone has to catch up. Today it’s a pretty safe statement to declare that open source is eating the world. Open source won, it’s everywhere, you can’t not use it. It’s not always well understood. And it’s powering your supply chain, even if you don’t know it.

In a previous post I talked about what open source dependencies are. This post is meant to explain how all these dependencies interact with each other and what you need to know about them. The topic of supply chains is coming up more and more, and it’s usually not great news. When open source comes up in the context of the supply chain, it’s very common for the story to center on how dangerous open source is. Of course, if you just use this one tool, or this one vendor, or this one something, you’ll be able to sleep at night. Buying solutions for problems you don’t understand is usually slightly less useful than just throwing your money directly into the fire.

Any application depends on other software. Without getting overly detailed it’s safe to say that most of us develop software using libraries, interpreters, compilers, and operating systems from somewhere else. In most cases these are open source projects. Purely proprietary software is an endangered species. It’s probably already extinct but there are a few deniers who won’t let it go quietly into the night.

The intent of the next few blog posts is to pick apart what using open source in your supply chain means. For the rest of this particular post we’re going to focus on the open source libraries you depend on in your project. Specifically I’m going to pick on npm and containers in my examples. They have two very different ways of dealing with dependencies. Containers tend to include packaged dependencies, whereas npm takes a more on-demand approach. I don’t think either one is the right answer; each has drawbacks and advantages. They’re just nice examples that are widely used.

Let’s explain Containers first.

So in the container world we use what’s called a filesystem bundle. It’s really just a compressed archive file but that’s not important. The idea is if you need some sort of library to solve a problem, you toss it in the bundle. You can share your bundles, others can add more things on top, and then you ship one tidy package with all the important bits stuffed inside. This is mostly done because it’s far easier to deploy a complete system than it is to give someone hundreds of poorly written instructions to set up and deploy a solution. Sysadmins from the late ’90s and early 2000s understand this pain better than anyone. The advantages substantially outweigh the drawbacks, which is one of the reasons containers are taking over the world.
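To make the "add more things on top" idea concrete, here is a toy sketch. It is not how any real container format stores data; it only models a bundle as a stack of layers, where each layer maps file paths to contents and later layers win. The file names and version strings are made up for illustration.

```python
def flatten(layers):
    """Merge a stack of layers into one bundle; later layers override earlier ones."""
    bundle = {}
    for layer in layers:
        bundle.update(layer)
    return bundle


# Hypothetical base bundle that someone else built and shared.
base_layer = {
    "/bin/sh": "shell",
    "/lib/libexample.so": "libexample 1.0",
}

# Hypothetical layer you add on top: your application plus one extra library.
app_layer = {
    "/app/server": "my application",
    "/lib/libextra.so": "libextra 2.3",
}

bundle = flatten([base_layer, app_layer])

# Everything the application needs ships together, including the base
# layer's libexample 1.0, which stays at 1.0 until someone rebuilds the bundle.
for path in sorted(bundle):
    print(path, "->", bundle[path])
```

The last comment is the catch: whatever was in the bundle when it was built is what you keep running.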

The way something like npm does this is a bit different. When you need a dependency with npm, you install the dependency, then it installs whatever it needs. It’s sort of turtles all the way down, with dependencies having dependencies of dependencies. Then you get to use it. The thing that’s missed sometimes is that if you install something today, then install the exact same thing tomorrow, you could get a different set of packages and versions. If version 1.2 is released today it couldn’t have been the version you installed yesterday. This has the advantage of getting more up-to-date packages, but has the downside of breaking things, as newer packages can behave differently. You can work around this by specifying a certain version of a package at install time. It’s not uncommon to peg (or pin) the version like this, but it does introduce some of the container problems with outdated dependencies.
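Here is a minimal sketch of why the same install can give different results on different days. It is a simplification, not npm’s actual resolver: it assumes plain MAJOR.MINOR.PATCH versions and understands only two kinds of request, an exact version like 1.1.0 and a caret range like ^1.1.0, which for major versions of 1 and up means "at least 1.1.0 but below 2.0.0". The registry contents are hypothetical.

```python
def parse(version):
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def resolve(spec, available):
    """Pick the highest available version that satisfies spec.

    Simplified: spec is either exact ('1.1.0') or a caret range ('^1.1.0').
    The caret handling here assumes a major version of 1 or higher.
    """
    if spec.startswith("^"):
        floor = parse(spec[1:])
        candidates = [
            v for v in available
            if parse(v)[0] == floor[0] and parse(v) >= floor
        ]
    else:
        candidates = [v for v in available if parse(v) == parse(spec)]
    if not candidates:
        raise LookupError("no version satisfies " + spec)
    return max(candidates, key=parse)


# Hypothetical registry contents for one package on two consecutive days.
registry_yesterday = ["1.0.0", "1.1.0"]
registry_today = registry_yesterday + ["1.2.0"]

floating_yesterday = resolve("^1.1.0", registry_yesterday)
floating_today = resolve("^1.1.0", registry_today)
pinned_today = resolve("1.1.0", registry_today)

print(floating_yesterday)  # 1.1.0
print(floating_today)      # 1.2.0, same request, different result
print(pinned_today)        # 1.1.0, pegged, so it never moves
```

The range request picks up 1.2.0 the moment it exists, while the pegged request keeps returning 1.1.0 no matter what gets published afterwards.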

My code is old but it works

There are basically two options here, and each one is a tradeoff.

You have old code in your project, but it works because you don’t have to worry about a newer library version changing something. While the code doesn’t change, the world around it does. There is a 100% chance security flaws and bugs will be discovered and fixed in the dependencies you rely on.

The other option is you don’t have old libraries, you update things constantly and quickly but you run the risk of breaking your application every time you update the dependencies. It’s also 100% risk. At some point, something will happen that breaks your application. Sometimes it will be a one line fix, sometimes you’re going to be rewriting huge sections of a feature or installing the old library and never updating it again.
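One rough way to see both options at once is to compare what you have pegged against what is currently available. The sketch below assumes the libraries follow the common convention that a new major version may break things while minor and patch releases usually don’t; plenty of real projects don’t honor that, so treat the labels as hints. The package names and versions are made up.

```python
def parse(version):
    """Turn '2.3.0' into (2, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def update_risk(pinned, latest):
    """Label how far a pegged version is behind the latest release."""
    old, new = parse(pinned), parse(latest)
    if new <= old:
        return "current"
    if new[0] > old[0]:
        return "major"  # behind, and updating may break the application
    return "minor"      # behind, but the update is probably compatible


# Hypothetical pegged versions and latest published versions.
pinned = {"left-lib": "1.4.0", "mid-lib": "2.0.1", "right-lib": "3.2.0"}
latest = {"left-lib": "1.4.0", "mid-lib": "2.3.0", "right-lib": "4.0.0"}

report = {name: update_risk(version, latest[name]) for name, version in pinned.items()}

for name, status in report.items():
    print(name, pinned[name], "->", latest[name], status)
```

Anything marked "minor" or "major" is old code that works today; anything marked "major" is also where the constant-update approach is most likely to bite.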

The constant update option is the more DevOps style and probably the future, but we have to get ourselves to that future. It’s not practical for every project to update its dependencies at this breakneck speed.

What now

The purpose of this post isn’t to solve any problems; it’s just to explain where we are today. Problem solving will come as part of the next few posts on this topic. I have future posts that will explain how to handle the dependencies in your project, and a post that explains some of the rules and expectations around handling open source security problems.
