The Register recently published a story titled “Putin on the code: DoD reportedly relies on utility written by Russian dev.” They should be ashamed of this story. This poor open source developer is getting beaten up now so someone can score some internet points. It’s very upsetting.
But anyway, let’s look at some receipts.
If you’re not real smrt, it seems like pointing out an open source project is written by one person in a country you don’t like is a bad thing. It could be. But it also could be the software running THE WHOLE F*CKING PLANET is written by one person. In a country. But we have no idea which country. It’s not the same person mind you, but it’s one person.
Here’s the thing. Almost all open source is literally one person. What I mean by that is if you look at all the open source projects out there, and there are a lot, we see a pattern of one person no matter how we slice and dice the data.
So let’s start with the data. A project exists called ecosyste.ms that catalogs a lot of open source. Most of it I would guess, but not all. They currently have 11.8 million open source projects in their data. You would be right to think that is a big number. I’m told anything over 15 is a big number, but it probably depends how smart you are, or think you are.
So what do I mean when I say open source is one person? What I mean is if we look at all the projects that ecosyste.ms is tracking, how many have a single person maintaining that project? It’s about 7 million. This is also a big number. 7 million open source projects are one person. It’s actually bigger than that, because of the 11.8 million projects ecosyste.ms is tracking, we don’t know how many maintainers 4 million of the projects have. A bunch of those will be one person. Here’s what a graph of this looks like.
I clipped the graph so it looks nicer. There are projects with hundreds of maintainers. Not a ton, but they exist.
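If you want to see the kind of counting behind a graph like this, here is a minimal sketch. It is not the script I used and it does not call ecosyste.ms. The records are made-up toy data, and the `maintainers` field name is an assumption for illustration. The one idea worth noticing is that projects with no maintainer data get their own "unknown" bucket instead of being silently dropped.

```python
from collections import Counter

# Toy records, NOT real ecosyste.ms data. The "maintainers" field name
# is an assumption made for this example.
projects = [
    {"name": "a", "maintainers": ["alice"]},
    {"name": "b", "maintainers": ["bob"]},
    {"name": "c", "maintainers": ["alice", "carol"]},
    {"name": "d", "maintainers": None},  # maintainer count unknown
    {"name": "e", "maintainers": ["dave"]},
]


def maintainer_histogram(projects):
    """Count how many projects have 1, 2, 3... maintainers.

    Projects whose maintainer list is missing are counted under "unknown".
    """
    hist = Counter()
    for project in projects:
        maintainers = project.get("maintainers")
        key = "unknown" if maintainers is None else len(maintainers)
        hist[key] += 1
    return hist


hist = maintainer_histogram(projects)
for bucket, count in hist.items():
    print(f"{bucket} maintainer(s): {count} project(s)")
```

Clipping the graph is just a matter of not plotting the buckets above some maintainer count.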
Now, the clever people among us are thinking “but Josh, surely these 7 million projects are all things nobody uses, the important open source we all use has loads of maintainers!!!”
You would be right to think that. It’s the first thought I had back when I started to look at this data. It’s OK. You’re still in the denial stage. Hopefully you’ll reach anger by the end of this post.
So we’re going to use the NPM ecosystem to explain this. I use NPM because they have the richest data in ecosyste.ms to explain my point. I’ve done this same thing across multiple ecosystems and the graphs all look the same.
So, what does the NPM maintainer graph look like?
Your first thought is probably “why is the left axis green?” It’s not an axis, it’s the single maintainer number. It’s that huge compared to literally all the other data. There are just that many single person NPM projects.
So now, let’s look at the number of maintainers for projects with over 1 million downloads this month.
This time the graph shows how many projects with over 1 million downloads have one maintainer, and how many have more than one. It was easier to show the data by creating these two buckets.
That’s almost a 50/50 split. Think about that. About half of the 13,000 most downloaded NPM packages are ONE PERSON. We can change the download number and the graph stays this shape. It’s not until I change downloads to 1 billion downloads that we see 1 package maintained by 1 person, and 9 packages maintained by more than 1.
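The two-bucket split is simple enough to sketch. Again, this is an illustration with toy data, not my actual analysis. The field names `downloads` and `maintainer_count` and the function `split_by_maintainers` are hypothetical. Changing the download number is just a matter of passing a different threshold.

```python
# Toy records, NOT real NPM data. Field names are assumptions.
packages = [
    {"name": "p1", "downloads": 2_500_000, "maintainer_count": 1},
    {"name": "p2", "downloads": 1_200_000, "maintainer_count": 3},
    {"name": "p3", "downloads": 900, "maintainer_count": 1},
    {"name": "p4", "downloads": 1_500_000_000, "maintainer_count": 2},
    {"name": "p5", "downloads": 5_000_000, "maintainer_count": 1},
]


def split_by_maintainers(packages, min_downloads):
    """Return (single, multi) counts for packages with more than min_downloads."""
    popular = [p for p in packages if p["downloads"] > min_downloads]
    single = sum(1 for p in popular if p["maintainer_count"] == 1)
    multi = sum(1 for p in popular if p["maintainer_count"] > 1)
    return single, multi


over_million = split_by_maintainers(packages, 1_000_000)
over_billion = split_by_maintainers(packages, 1_000_000_000)
print("over 1 million:", over_million)
print("over 1 billion:", over_billion)
```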
This is open source. Open source is one person, even the popular stuff.
I will also add, a lot of people own more than one package. So while NPM has over 4 million single person projects, they have about 900,000 maintainers for those 4 million single person projects. This will be an important data point at the end.
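Using the rough numbers above, that works out to something like 4 million / 900,000, or about 4.4 single person packages per maintainer. Here is a small sketch of how you could count distinct owners. It uses toy data and a hypothetical `maintainers` field, so treat it as an illustration and not the real query.

```python
# Toy records, NOT real NPM data. The "maintainers" field is an assumption.
npm_packages = [
    {"name": "left", "maintainers": ["alice"]},
    {"name": "right", "maintainers": ["alice"]},
    {"name": "up", "maintainers": ["alice"]},
    {"name": "down", "maintainers": ["bob"]},
    {"name": "big", "maintainers": ["carol", "dave"]},
]


def single_maintainer_stats(packages):
    """Return (number of single person packages, number of distinct owners)."""
    solo = [p for p in packages if len(p["maintainers"]) == 1]
    owners = {p["maintainers"][0] for p in solo}
    return len(solo), len(owners)


solo_count, owner_count = single_maintainer_stats(npm_packages)
print(f"{solo_count} single person packages, {owner_count} distinct maintainers")
print(f"{solo_count / owner_count:.1f} packages per maintainer")
```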
So here’s the big conclusion. If you want to make a big deal about something, maybe it shouldn’t be what country a sole maintainer is from. Let’s face it, the Russians aren’t dumb enough to backdoor a package owned by a guy living in Russia. They’re going to do something like pretend to be from another country with a name like Jia Tan, not Boris D. Badguy. This isn’t a Rocky and Bullwinkle episode.
Anyway, back to the conclusion.
Open source, the thing that drives the world, the thing Harvard says has an economic value of 8.8 trillion dollars (also a big number). Most of it is one person. And I can promise you not one of those single person projects has the resources it needs. If you want to talk about possible risks to your supply chain, look at the single maintainer who’s grossly underpaid and overworked. That’s the risk. The country they are from is irrelevant.
And now if we have news stories being written about how a single person maintainer is the bad guy? That’s not cool (this is where your denial is supposed to turn into anger).
So what can you do about this? How can you turn your new denial-turned-anger into action? We don’t really know, unfortunately. I discussed this in a podcast episode, “Hobbyist Maintainers” with Thomas DePierre. Like many hard problems, there isn’t an easy solution. But I guarantee the solution isn’t hunting down and demonizing single maintainers.