If you follow the vulnerability world, 2024 is starting to feel like we’ve become trapped in the mirror universe. NVD collapsed, the Linux kernel is generating a huge number of CVE IDs, CISA is maybe enriching the CVE data, and the growth rate of CVE is higher than it’s ever been. It feels like we’re careening off a cliff in the clown car, where half the people are trapped inside trying to get out and the other half are laughing at the clown honking its nose.

I want to start out by saying none of this is an accident. A lot of gears have been turning for years, even decades; we’re seeing the result of those trends finally coming together. It was only a matter of time until this happened. Let’s look at a few of those trends.

The size of CVE

The first and most important thing to understand is the sheer size of CVE today. If we graph the CVE IDs published between January 1 and this point in the year, for each year, we can see the current rate of growth:

2024 is more than double the IDs we had at this point in 2021. It’s pretty clear from the graph that the rate of growth is expanding; 2024 will be double 2022 before the year ends. The “let’s just all go back to normal” Very Serious People should probably take note. Normal already careened off the cliff right in front of the clown car; it’s gone forever.
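A count like this is easy to reproduce yourself. Here’s a minimal sketch, assuming a local clone of the CVE Program’s cvelistV5 repository (https://github.com/CVEProject/cvelistV5); the cutoff date is a hypothetical stand-in for “the time of this writing”, so adjust it to taste.

```python
# Minimal sketch: year-to-date CVE counts from a local clone of
# https://github.com/CVEProject/cvelistV5. The cutoff is a
# hypothetical stand-in for "the time of this writing".
import json
from collections import Counter
from pathlib import Path

REPO = Path("cvelistV5")  # assumed location of the local clone
CUTOFF = ("05", "15")     # hypothetical month/day cutoff

counts = Counter()
for record in (REPO / "cves").rglob("CVE-*.json"):
    meta = json.loads(record.read_text())["cveMetadata"]
    published = meta.get("datePublished", "")  # e.g. 2024-03-04T17:15:09.000Z
    if meta.get("state") != "PUBLISHED" or not published:
        continue
    year, month, day = published[:4], published[5:7], published[8:10]
    if (month, day) <= CUTOFF:  # only count January 1 through the cutoff
        counts[year] += 1

for year in sorted(counts):
    print(year, counts[year])
```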

How many vulnerability budgets have doubled since 2021? Probably none. We were not prepared for this rate of growth. The best solution we currently have is “thoughts and prayers”.

Even if we wanted to double the size of our vulnerability teams, there aren’t enough people to hire. This isn’t a space that has a huge number of people, much less a huge number of people looking for work. And really, does anyone believe more people could solve this problem?

Before we move on, there’s an aspect of this growth that rarely gets mentioned. If you search GitHub issues and pull requests for some common security terms, such as “buffer overflow” or “cross site scripting”, you get hundreds of thousands of hits just for those two terms. I’m sure this is an unimportant detail we should just ignore while we try to find where normal crashed in the ravine.
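You can check those numbers yourself. A minimal sketch against GitHub’s issue search API, which covers both issues and pull requests; unauthenticated requests are heavily rate limited and the counts move around, so treat the results as a rough floor.

```python
# Minimal sketch: headline match counts from GitHub's issue search,
# which returns both issues and pull requests. Unauthenticated
# requests are rate limited; add a token header for real use.
import requests

def search_hits(term: str) -> int:
    resp = requests.get(
        "https://api.github.com/search/issues",
        params={"q": f'"{term}"', "per_page": 1},
        headers={"Accept": "application/vnd.github+json"},
        timeout=30,
    )
    resp.raise_for_status()
    # total_count is the headline number; only the first 1,000
    # results are actually pageable.
    return resp.json()["total_count"]

for term in ("buffer overflow", "cross site scripting"):
    print(term, search_hits(term))
```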

The Linux kernel

We just mentioned overall growth, but the Linux kernel is a part of that growth. The kernel is adding A LOT of IDs. Enough to show up on a graph.

The green in the graph is Linux kernel CVE IDs; the blue is all other IDs. It’s a huge number. The kernel is pretty huge though, so these numbers shouldn’t be unexpected.
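Splitting the graph this way is a small tweak to the earlier counting sketch. Each CVE record carries the assigning CNA in cveMetadata.assignerShortName; I’m assuming the kernel CNA’s short name is “Linux” here, which is worth verifying against the records themselves.

```python
# Minimal sketch: split 2024 counts into kernel CNA vs. everything
# else, using the same cvelistV5 clone as before. "Linux" as the
# kernel CNA short name is an assumption worth double-checking.
import json
from collections import Counter
from pathlib import Path

REPO = Path("cvelistV5")

counts = Counter()
for record in (REPO / "cves" / "2024").rglob("CVE-*.json"):
    meta = json.loads(record.read_text())["cveMetadata"]
    if meta.get("state") != "PUBLISHED":
        continue
    cna = meta.get("assignerShortName", "unknown")
    counts["kernel" if cna == "Linux" else "everything else"] += 1

print(counts)
```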

There are many very serious people who don’t think most of these kernel issues should have CVE IDs. They claim the bugs aren’t bad enough to warrant a CVE, that more verification should be done on the bugs, and that the kernel should only file things attackers could use to steal bitcoins. There are many complaints about this volume of IDs. The kernel is doing something new and different, so it must be wrong.

Greg K-H explained on the Open Source Security Podcast (which I co-host) that the kernel is used on too many devices to easily decide what is or isn’t a vulnerability. It’s on everything from phones to spaceships to milking machines. There aren’t many projects that can claim such a wide reach.

Even if we ignore these supposedly low quality kernel CVE IDs, there are also a lot of non-kernel CVE IDs that could be considered low quality, meaning issues that could be called plain old bugs depending on who does the analysis. If we want to raise the bar for what counts as a vulnerability, that won’t just affect the kernel; it would affect a large number of CVE IDs. There are a lot of bugs that are right on the line of the definition of a vulnerability. It’s sadly about as well defined as what a sandwich is.

This isn’t really a problem though; we’ll explain a bit more about data quality and how we use it below.

The demise of NVD

One of the more exciting events this year was in February, when NVD stopped enriching vulnerabilities. We can see from this graph that enrichment dropped off pretty abruptly. The dropoff was unexpected and didn’t come with any sort of explanation for many weeks.

Before February, NVD would add data to published CVE IDs: details like the products and versions affected, the vulnerability severity, and the type of flaw. The data wasn’t the best, but sort of like that falling-apart car we had as a kid, it was ours and it got us where we wanted to go, usually.
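You can spot-check whether any given ID ever got that treatment. A minimal sketch, assuming the NVD 2.0 API; treating “has CVSS metrics and CPE configurations” as “enriched” is my own rough heuristic, not an official definition (and CNAs can now supply some of this data themselves).

```python
# Minimal sketch: does NVD hold enrichment-style data for a CVE?
# "metrics plus configurations" as a proxy for enrichment is my own
# rough heuristic, not an official NVD definition.
import requests

def looks_enriched(cve_id: str) -> bool:
    resp = requests.get(
        "https://services.nvd.nist.gov/rest/json/cves/2.0",
        params={"cveId": cve_id},
        timeout=30,
    )
    resp.raise_for_status()
    vulns = resp.json().get("vulnerabilities", [])
    if not vulns:
        return False
    cve = vulns[0]["cve"]
    return bool(cve.get("metrics")) and bool(cve.get("configurations"))

print(looks_enriched("CVE-2021-44228"))  # Log4Shell, long since enriched
```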

Since this sudden change in enrichment there has been quite a bit of confusion about what happens moving forward. For example, the FedRAMP standard specifically names NVD as the source for vulnerability severity. What happens if there is no NVD severity anymore? Nobody seems to know, so we’ll just wait for everything to go back to normal.

CISA has also started to add some enrichment data in a project they have named Vulnrichment. The Vulnrichment project has people at CISA adding vulnerability details similar to how NVD worked. The devil is in the details though. Many words could be written about what CISA is and isn’t doing with Vulnrichment. We’ll wait and see how the project looks in a few months; it’s still changing pretty rapidly. I’m sure they’re talking with the NVD folks about all this … right … RIGHT?

NVD has made several announcements about the current and future status of the project. We’re about six pinky swears deep now that everything will go back to normal soon. I’m sure this time it’s real. The thing is, NVD can’t go back to “normal”. Normal fell off the cliff, remember? The CVE graph shows the volume of IDs is growing at an accelerating rate. Unless NVD gets a massive budget increase, every year, forever, it’s silly to even pretend.

The unsustainable future

If we look at all the things happening with vulnerabilities, it doesn’t spark joy. None of the graphs are pointing in directions we want them to. In theory, if we had more effective security we would see fewer vulnerabilities every year, not more. But there’s another trend that’s rarely discussed by security people: the mind-boggling growth of open source. I gave a talk at CypherCon earlier this year about the size of open source. It’s gigantic, bigger than you can imagine. Here’s a graph of releases of open source packages:

That’s more than 100 million releases in the last 15 years. To put this in perspective, there have been about 250 thousand CVE IDs ever created, EVER. That works out to one CVE ID for every 400 releases. Any security researcher could look at any piece of software and find at least one vulnerability. I’ll let you think about that one for a while.

Given the CVE graph and now this open source graph, it’s pretty clear the trend for the number of … everything is only going to go up for the next few years. If we can’t handle the number of vulnerabilities we have today, how can we possibly deal with double the number in a few years? Oh right, normal, we should go back to normal. Normal will solve our problems.

There are a few things we do today that are making these problems worse. I don’t know exactly how to fix them, but pointing out the problems can help sometimes.

Stop the zero CVE madness

There’s a prevailing attitude in the industry that we have to get the number of CVEs in our environments to zero. This creates a perverse incentive where the goal is a number, not better security. We should try to find ways to discover which vulnerabilities matter and deal with those, not do whatever it takes to make a number on a spreadsheet equal zero.

The size of the numbers in the above graphs means we have to transition out of thinking about vulnerabilities individually and start looking at the data in aggregate. When you have this volume of data there’s no longer a zero, only graphs going up and down. Security and data science need to collide.
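What does “in aggregate” look like in practice? A minimal sketch, assuming a hypothetical findings.csv export from your scanner with one row per finding; instead of asking “are we at zero?”, it asks “which way is the backlog moving?”

```python
# Minimal sketch: track the open-finding backlog as a trend instead
# of chasing zero. findings.csv is a hypothetical scanner export with
# columns: id, opened, closed (closed left empty while still open).
import pandas as pd

findings = pd.read_csv("findings.csv", parse_dates=["opened", "closed"])
weeks = pd.date_range(findings["opened"].min(), pd.Timestamp.today(), freq="W")

# For each week, count findings opened by then and not yet closed.
open_counts = [
    ((findings["opened"] <= week)
     & (findings["closed"].isna() | (findings["closed"] > week))).sum()
    for week in weeks
]
backlog = pd.Series(open_counts, index=weeks)

# The week-over-week change is the number that matters, not zero.
print(backlog.diff().tail())
```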

Lack of cooperation

There’s virtually no cooperation when it comes to vulnerability data today. If you’re an analyst working on any of this, you’re probably working alone in a dark room filled with old pizza boxes. You might know some fellow vulnerability nerds you can talk to, over Discord of course, but mostly it’s a lonely job. There is a ton of duplicate work happening, and we keep reinventing the same square wheels. There are a lot of people working on this data, just not a lot of people working together on this data.

The data quality is terrible

The publicly available vulnerability data is terrible. If you work with this data, this needs no explanation. Until a year ago most of this data was prose text. Think poetry, but very bad poetry, Vogon poetry. It’s somewhat more machine readable as of early 2024, but it often contains errors that are difficult or impossible to get fixed. Entire companies exist just to clean up this data and then sell it. They are basically selling facts; that’s how bad the data is. This data is like sportsball scores: the scores themselves should be free, and the business gets built around adding more data or insightful commentary.

And one last thing to think about around data quality. If we go back to the complaint that the Linux kernel has too many low quality vulnerabilities that are just bugs, we should also keep in mind that there are hundreds of thousands, maybe even millions, of untriaged vulnerabilities in open source projects (remember those GitHub searches). While some will be the low quality, probably-just-a-bug type, some are going to be very critical. Even if it’s only 0.5%, that’s 2,500 critical vulnerabilities for every half million reports. We’re ignoring these today.

Can we fix it?

We probably can fix this if we want to. It’s pretty important, but so far we’ve successfully managed to ignore it. It’s not going to be easy. By definition, the people who got us here can’t fix this problem; they would have long ago if they could. Many of the people involved in the vulnerability space, CVE, NVD, have been doing this work for decades, and a bunch don’t think there’s anything wrong. But if you talk to any vulnerability analyst, developer, or operations person actually working with CVE IDs, they’re miserable. They get their work done in spite of CVE IDs, not because of them.

Here’s the thing though: the people suffering under the boot of vulnerability management have never really had a place to discuss, complain, and explore. This is a very closed off world. It feels lonely and terrible when we’re all working in our dark little rooms instead of working together. There’s a lot that could be said here, and there have been some attempts, but much of the attention to date has been focused on the people creating vulnerability data, not the people consuming it and what they need. If a group for the consumers manages to emerge, I’ll be sure to spread the word far and wide.

Or we could keep looking for normal. I’m sure it’s around here somewhere.