given enough eyeballs, all bugs are shallow – Linus’s Law
A long time ago we thought Linus’s Law was a real thing, and that it was why open source was better than closed source. It seems pretty well accepted now that Linus’s Law was never really a thing. It’s far more likely that a lot of open source was pretty good because the authors were worried someone WOULD look and judge them if the code looked like crap. We all have dark corners of private GitHub repos that are the code equivalent of a festering boil.
But there is something happening right now. And while I love to be overly critical and skeptical of LLMs, this story is about LLMs not totally sucking. No, it’s not about Mythos either, well it might be a little bit, but Mythos doesn’t really matter.
So I’ve been talking to quite a few people about using LLMs to find security vulnerabilities in open source software. And the thing they all keep saying is that literally everywhere they look, they find vulnerabilities. It makes no difference whether the open source project is large or small. Even the bad LLMs can find things.
I’ve tried this out on a few limited test projects, but I’m not going to let an LLM go wild on something that matters. I have a very strict ethical code about finding and reporting vulnerabilities, and following it is a huge amount of work. I’m also extremely lazy, which makes me avoid doing things that are a huge amount of work. It’s one of those immovable-object-meets-unstoppable-force problems.
But there are some smart people I trust who have been talking about this. The best is probably Daniel Stenberg. He has a recent blog post titled “High-Quality Chaos”. Daniel is the real deal and I trust his judgement on this. I also know curl is one of the higher quality projects out there, and they are seeing a ton of decent LLM-generated vulnerability reports.
Millions of CVEs
In the security world we’ve been saying for years that there probably should be millions of CVEs every year, not thousands. We weren’t just making up crap when we said this. Well, we sort of were, but we were pretty sure we were right, and now it seems we were in fact right. Sometimes there’s nothing worse than being right.
Let’s do some simple math: there are millions of open source projects. If an LLM can find vulnerabilities everywhere it looks, that’s millions of CVEs. Not every CVE will be a Log4Shell-level bug, but they are real vulnerabilities.
It’s also worth noting that we can’t really handle a CVE volume in the tens of thousands, so millions is something nobody can actually comprehend.
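To put rough numbers on that back-of-the-envelope math, here is a tiny Python sketch. Every figure in it is an assumption picked for illustration (two million projects, one finding per project, 40,000 CVEs a year as the current baseline), not a measurement, and `estimate_cves` is just a made-up helper name.

```python
def estimate_cves(projects: int, findings_per_project: float) -> int:
    """Rough CVE count if every project yields some number of findings."""
    return round(projects * findings_per_project)


# All of these values are assumptions for illustration only.
PROJECTS = 2_000_000          # stand-in for "millions of open source projects"
FINDINGS_PER_PROJECT = 1.0    # assume one real vulnerability per project
CURRENT_YEARLY_CVES = 40_000  # stand-in for "tens of thousands" per year

total = estimate_cves(PROJECTS, FINDINGS_PER_PROJECT)
print(f"estimated CVEs: {total:,}")
print(f"that is {total / CURRENT_YEARLY_CVES:.0f}x the assumed yearly volume")
```

With these made-up inputs it comes out to two million CVEs, or 50 times the assumed current volume. Change the inputs however you like; it is hard to find plausible ones that give a comfortable answer.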
Linus’s Law
So this brings us to Linus’s Law. It seems pretty clear now that nobody was in fact looking at the code. If they were, they would have found vulnerabilities in everything. But the number of people finding and reporting vulnerabilities was pretty small. It is hard to find security vulnerabilities as a human, but the point of Linus’s Law was never that a few very smart people were looking for bugs. The point was a sort of infinite monkey theorem of bug finding.
It would be easy to proclaim LLMs as our infinite eyeballs, but it’s more complicated than that. While LLMs might be able to find vulnerabilities, the real challenge is going to be reporting and coordinating all of these new findings. Even without an LLM the disclosure process was always a thousand times more work than finding the security vulnerability.
The new version of Linus’s Law should read something like:
With enough LLMs, you’re going to be disclosing this stuff forever
The next few years are going to be wild. Anyone telling you they know how to deal with this is full of crap. Nobody knows what to do. This is a human problem, and we can’t technology our way out of it.
The future of open source
What will happen to open source is also scary. Not because LLMs are going to rewrite all the open source in Rust (anyone claiming this almost certainly doesn’t know how to program). The thing that’s actually scary comes back to the human problem.
Open source isn’t really source code. Open source is people. And those people are already pretty burnt out.
Now imagine you’re an open source maintainer on the edge of burnout and you start getting a ton of new vulnerability reports from random people, and all the reports are actually vulnerabilities. Fixing vulnerabilities is a lot more work than fixing other bugs. You need to write an advisory, get a CVE, maybe tell some other companies or projects using your stuff, and discuss the plan with the vulnerability reporter. Then you have to publish all this at the same time you release the new version. It’s a ton of work. Now do that a thousand times.
There are basically three possible responses to this that I can see:
- Stop working on the things you enjoy and spend your very limited time dealing only with vulnerability reports, probably forever
- Ignore the vulnerability reports, YOLO security!
- Quit
I suspect we’re going to see a lot of number 3.
Big Conclusion
We don’t know how to fix this. The way vulnerability disclosure works and the way open source works weren’t designed for this volume of stuff. There will be lots of new projects and companies that will promise to solve this problem. They probably won’t. Open source evolved over time in a very public way. We need many of those same people to help work on this in a very public way that includes everyone from single maintainers, to foundations, to huge companies.
An added challenge is that the people who have been playing the part of Cassandra for many years, claiming there are millions of vulnerabilities, are also very busy now and pretty burnt out.
Maybe with enough eyeballs, all bugs are terrible.