Brian Fox discusses the challenges and future of open source package repository infrastructure. We discuss the complexities of managing public registries, the impact of overconsumption, and the importance of sustainable practices in the open source community. Brian tells us how organizations can reduce their footprint and contribute to a more balanced ecosystem. The package repositories cannot continue to be the world’s CDN.

This episode is also available as a podcast. Search for “Open Source Security” on your favorite podcast player.

Episode Transcript

Josh Bressers (00:00) Today, Open Source Security is talking with Brian Fox, the co-founder and CTO of Sonatype, and a good friend of mine. So Brian, welcome back to the show, man. It’s a treat.

Brian Fox (Sonatype) (00:08) Yeah,

yeah good to be back here.

Josh Bressers (00:10) And we’ve

got a doozy of a topic today, like holy cow. So you are part of a group that wrote an open letter, and feel free to just jump in and correct me as I get any of this wrong. An open letter, was it published by the OpenSSF or through the OpenSSF? I don’t know kind of the detail there.

Brian Fox (Sonatype) (00:14) Yeah. Yeah.

Yeah, I mean, it unfolded in a weird way, but yes, it was published on the OpenSSF blog. They signed on at the end. It is a group of basically all the public registries.

Josh Bressers (00:37) Okay.

Brian Fox (Sonatype) (00:47) Maven Central, PyPI, PHP. I heard Perl is signing it now after the fact, which is great. Alpha Omega, OpenSSF, OpenJS, Eclipse Foundation, right? And so we kind of decided that a good neutral spot was the OpenSSF to publish it. But there was a lot of good reception there, so they joined on as official signatories as well. Yeah.

Josh Bressers (00:54) Nice.

Awesome.

That’s cool.

And the title of the letter is an open letter from the stewards of public open source infrastructure. And I feel like the titles and names you just listed aren’t the stewards of open source infrastructure. They are the foundation of modern society. Where, like, if any of these organizations disappeared, civilization collapses.

Brian Fox (Sonatype) (01:35) At least for a couple weeks or months while people pick up the pieces. Yeah. Yeah, some of them could be a mess, right? So yeah, what are we talking about here? I mean, we’re talking about basically all of the

The open source and other, and I’m sure we’ll come back to the other, but all of the open source supply chain for all the modern languages are signed on here, right? Because we all share the same problem of overconsumption of these free resources that are meant for open source.

Josh Bressers (02:11) Okay, let’s, I want to dig into that because I feel like there’s maybe a misunderstanding of sometimes the concept of open source. So there’s open source, the licenses, and then there’s like open source, this infrastructure, which you are part of Maven Central and you listed off a whole bunch of other repos. But fundamentally, these are volunteer organizations that are

hosting an enormous amount of data and quite frankly, they’re running the world. I mean, you talk about this in your letter, is there are, you’re the CDN for the world, functionally speaking, right?

Brian Fox (Sonatype) (02:50) Yeah, yeah. I mean, so if you think about open source, you know, historically, right, it started as, we’re writing software, which is words, basically, at the end of the day, we’re publishing those and we’re making them available for other people to do what they want with, whether they want to extend them, copy them, do all these other things, right? That’s like the purest form of open source is literally I’ve made the source code available.

But starting, I think Maven credibly was one of the first that kind of started doing this. We started providing the actual built binary artifacts as a convenience artifact, which then very quickly became the de facto artifacts, right? Like in modern open source,

people don’t really download the source code and rebuild these things from scratch. I used to have to do that in the super early days of open source, but it was hard. Trying to figure out how to build somebody else’s stuff is hard, right? And so I would say that the rise of binary packages across all these ecosystems actually contributed to the boom of open source that we saw over the past couple of decades, right? It became easy to include these things. And then you think about package,

packaging tools like Maven and others that can deal with all of the transitive dependencies. So you pull in one thing and it pulls in 10 more. That can be hard to manage. 10, 100 if it’s JavaScript, 1,000 or 10,000 other things. And so doing that automatically from these registries is what actually makes it possible to rapidly consume and build on top of

Josh Bressers (04:20) Nothing pulls in 10, Brian.

Brian Fox (Sonatype) (04:37) open source and that’s all great. That’s what these registries were built for to make it.

possible to have a place to share the metadata and the packages themselves for consumers to be able to use. And that’s all great. ⁓ But what we’ve observed is over time, the usual psychology of humans of overgrazing the fields kind of starts to come into play. And I’ve been on this journey for about a year, at least, in terms of trying to bend the curve of the growth ⁓ of the consumption. ⁓

Josh Bressers (05:00) Yes.

Brian Fox (Sonatype) (05:14) Every year the growth of Central in terms of downloads looks like a hockey stick. So it’s like a fractal. Every year it’s the same shape, it’s just bigger. And that’s true of all package registries really. But when I started looking into it, and I…

I think we probably talked about this on a podcast sometime last year. You know, we found that it was less than 1% of the global IPs were driving more than 80% of my traffic, right? So it was extreme at the high end. And so we started implementing rate limits and trying to calm the traffic there. That created some interesting learnings. And then we started looking at organizations that instead of from a single IP number, they were coming from dozens or in some

cases multiple thousands of IP numbers distributed across a single organization, and these are giant commercial organizations. Let’s be clear, these are not open source

publishers, developers, you know, working on these things. These are organizations whose commercial software is built on top of this stuff. And, you know, I found some instances of organizations that were redownloading, you know, sets of 10 to 20,000 jars, redownloading them on average half a million times a month over and over and over again, right? And in Maven the jars don’t change. It’s, you know, it’s canonical. You can only add new things; you can’t

change things there. So if you’ve downloaded it once you don’t need to download it 499,000 more times. It is, I promise you, the same. And so, you know, we started putting some limits in place to try to calm that traffic as well.
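The conversation does not say how Central's limits are actually implemented. As a rough illustration of per-client rate limiting in general, here is a minimal token-bucket sketch in Python. The class names, the burst capacity, and the refill rate are all hypothetical and are not taken from Maven Central.

```python
from typing import Dict


class TokenBucket:
    """One client's allowance: bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start with a full bucket
        self.last = now         # time of the most recent request

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last request, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        # Each request spends one token; an empty bucket means "throttled".
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keeps a separate bucket per client key (hypothetically an IP or an organization)."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client: str, now: float) -> bool:
        bucket = self.buckets.get(client)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_sec, now)
            self.buckets[client] = bucket
        return bucket.allow(now)
```

Keying the buckets by organization rather than by single IP would correspond to the case Brian describes, where one company's traffic arrives from thousands of addresses. How such a grouping is done in practice is not covered in the conversation.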

And what I started to uncover, and I kind of chronicled this in a series of blogs, I started to uncover systemic challenges where it wasn’t usually the case that there was intentional abuse going on. It was decisions made by others in the ecosystem. Other build tools, other security tools, just were like, oh, Maven Central is available. It’s highly available. It’s super fast and it’s free. So why should I implement any caching for my tool? That’s hard.

Lean on this stuff.

And so what we’ve uncovered is, you know, like I said, the usual thing is when things are perceived to be free and infinite, it changes the economic decisions that the consumers make. It’s literally the tragedy of the commons kind of problem. And in talking to my peers at other ecosystems, I realized they are all in kind of the same situation. ⁓ You know, and certainly if you’re a not-for-profit and your funding all comes from,

from donations and memberships.

Those are hard to maintain, right? But why would you join one of these open source foundations merely to pay the bandwidth bill for a bunch of maybe bigger organizations than you just doing things without caching? Like that’s not exciting to anybody. And yet that’s fundamentally what’s been happening here. And, you know, the open source ecosystem’s ability to attract

Josh Bressers (08:17) Right.

Brian Fox (Sonatype) (08:30) you know, donations is kind of a zero-sum game. So if we’re taking this money and applying it to infrastructure, basically paying for various forms of abuse, by definition, that means that money is not available to help with the maintainers. Right. And so.

You know, this conversation around infrastructure is actually kind of a new one. And that was sort of the epiphany I had back in May. You know, we’ve all been talking about raising money to pay the maintainers forever. And it hasn’t been terribly effective, but we never talk about the infrastructure underneath these things. ⁓ And I do think there’s reason to be optimistic because these things are not optional. Organizations worldwide are dependent upon them. And if we approach

regulating this a little bit better. It might actually provide some money. It’s certainly going to free up money that’s currently going to donate to these infrastructure pieces, but ⁓ we might be able to produce extra streams that can in fact go to pay the maintainers. So I think it’s all interrelated, but we haven’t been talking about this part of the problem.

Josh Bressers (09:36) Holy cow, that’s a lot

to unpack, man. I… Okay, so I’m gonna put a link to the letter in the show notes. But as I told Brian before I hit record, like I went to take notes on this letter and I’m like, the whole letter is the note. Like, it’s very, very dense. But I think that’s okay and I think this is an important problem. And I think there are a couple pieces in here that really stuck out to me as kind of like an “oh” moment.

Brian Fox (Sonatype) (09:53) Mm-hmm.

Josh Bressers (10:06) And the first is the proprietary software distribution angle, which I hadn’t really thought about before, but there are, and this is so true though, right? There are proprietary services, web services, many of us use, that have like SDKs in these public repos, but those SDKs only work with their service. So it’s not like it’s some sort of reusable open source, which you point out in the letter. And like that…

Brian Fox (Sonatype) (10:06) Mm-hmm.

yeah.

Mm-hmm.

Josh Bressers (10:32) Man that feels so weird now that someone says it out loud and I hadn’t really put that together prior to this like

Brian Fox (Sonatype) (10:40) Yeah, it’s funny because some of this comes, you know, sort of surreptitiously. You know, we were doing a big migration of the publishing mechanism on Maven Central over the summer and the team was overloaded with migration-related issues. So I jumped in, you know, to help. I was doing frontline support on Maven Central, which surprised a lot of people. But, you know, I’m the kind of guy, I’ll get in the boat and row with you if we need to. And what I observed was like, there’s a lot of

support tickets coming in here from very angry people with commercial signatures in their email.

Josh Bressers (11:14) Hahaha

Brian Fox (Sonatype) (11:15) And I didn’t think much of it at first. And then I started to really pay closer attention. And I was like, wait a minute, why are these companies complaining about the validation rule that requires their source be open? They’re saying, I don’t want to publish my source. And I’m like, hold on, this is an open source repository. So let’s take a little bit of a closer look. And we found that there were cases where people were creating empty source components just to trick the validation. ⁓

pulling on that thread a little bit. And I found other instances where, you know, the downloads were almost all from browsers and not from the package tools. And it’s like, wait, so I Google that URL and I find like, right, this commercial entity is using the link on their website to download their software goes directly to Maven Central. Right? So that’s literally the free CDN. Like, why would you do that? Well, because I’m not paying for it and it’s free. And so that’s the, that’s the kind of the,

Josh Bressers (12:03) Holy crap!

Brian Fox (Sonatype) (12:15) the sort of side abuse that I’m talking about. Each one, most of the time, is small, but in aggregate, it’s quite large. And if we look at the number of easily identifiable commercial publishers across just my package registry, we’re talking tens of thousands.

Right now, some of these might be publishing real open source projects. Some of them are publishing SDKs that are quote unquote open source, but like you pointed out, really only work with their service. So they’re intricately tied to a paid service. And then there’s some that are just clearly like commercial binaries that aren’t even open source at all and are still kicking around. so trying to figure out how to balance that is a challenge because like the CRA, it’s hard to define

what is commercial, right? So I’m not claiming to have cracked that nut yet, but I have observed that this can be part of the solution instead of part of the problem. ⁓

Josh Bressers (13:05) Right.

Brian Fox (Sonatype) (13:14) But the other interesting thing is sort of these tools that depend upon it. And we just cut over Central to a new CDN provider at the end of September. So two days ago as we’re talking about this now. And we moved a petabyte a day of traffic from one CDN to another overnight. And shockingly, there were very few problems. But there was one really big problem that we had no

Josh Bressers (13:36) Whoa.

Brian Fox (Sonatype) (13:44) way of anticipating. Cloudflare has much better protections for denial of service and bots and things like that, which is why we are excited to do this, because it can help us get control over all of these costs. Well, it identified what appeared to be a denial of service attack because there was a bunch of stuff coming in with unusual user agents, which happened to be Python.

Josh Bressers (14:04) No.

Brian Fox (Sonatype) (14:05) Now think

about that for a minute. Maven Central is about Java. So Kotlin, Gradle, Ant, Maven, these are all the things you expect. You don’t expect Python code to be downloading Java dependencies. And the tool automatically flagged that and blocked it. So I wake up the next morning and I have a whole bunch of emails from, I’m gonna not name names here, because my goal is not to name and shame, but to use it as an example, from an open source project

⁓ asking what was going on and I said, okay, we see the problem. Help me understand your use case here. I don’t really understand why this, what’s actually happening here. I check our support portal and we have a whole bunch of support tickets from a whole bunch of companies that are much, much larger than us. ⁓

Josh Bressers (14:57) No.

Brian Fox (Sonatype) (14:57) asking

why this infrastructure is broken that I have no understanding of. Well, come to find out what’s actually happened is an open source project that deals with processing big data and AI-based workflows ⁓ had hard-coded some URLs into their Python scripts for fetching dependencies in an operational context. So every time a pipeline for processing data

Josh Bressers (15:18) ⁓

Brian Fox (Sonatype) (15:27) started up on a worker agent, it was re-downloading the same components over and over again, and these were inadvertently blocked. But what I then found out was that there was another global player who had built this thing into a very big service that they provide to the world

that was also dependent upon this download. So basically all of their big data customers were also getting, you know, inadvertently messed up on this. And as we unpack this, I’m just like, guys, you realize how ridiculous this is, right? All of your users and you are orders of magnitude bigger than we are. Why am I paying the bandwidth bill for your jobs to fire up? And worse, like some of the other cases I had documented in some of my previous blogs, there was literally

Josh Bressers (16:06) Right. Right.

Brian Fox (Sonatype) (16:16) no way for the users to do anything about it. It was hard-coded into the code. So we’re still in this situation, by the way. We figured out the problem and we allowed it temporarily, but they have to update the code in a fail-forward mode and wait for everybody to migrate before we can do anything about this.

And so this just kind of highlights what’s happening here. Again, somebody looked at this and said, these things I want, they’re available in central. So that’s the URL I’m going to include in the system. And now we find ourselves in a situation where a ginormous cloud company is inadvertently dependent upon ⁓ what was designed to be distributing open source for builds. Like we understand the operational importance of building software very well, but we don’t have this thing set up to be

serving millions of serverless type of agent downloads ⁓ at hyper scale. That just doesn’t make any sense.

Josh Bressers (17:13) Wow, wow, okay.

Brian Fox (Sonatype) (17:16) Right, where do you go from there? Right, and by the way, I know this exists in all the other ecosystems. You know, Eclipse is seeing some challenges with their OpenVSX, right? They created a marketplace for VS Code,

Josh Bressers (17:16) That’s so ridiculous!

Brian Fox (Sonatype) (17:35) you know, things for all the forks of VS Code. Well, guess what? All of the AI-based tools are based on VS Code, right? So my understanding is the rise of, you know, agentic development and all of that kind of stuff from all of these new popular tools is crushing their infrastructure. And they’re sort of sitting there saying, why should our members be paying for this? They’re not the ones consuming it. They’re not the ones producing it. And they’re not the ones building tens-of-billion-dollar businesses on top of this infrastructure,

yet they’re driving all the costs. So it’s very much the same problem across the whole ecosystem.

Josh Bressers (18:14) Yeah, yeah. I mean, I had Deb Nicholson on a couple episodes ago talking about the Python Software Foundation. And yeah, this was completely the point. And right now, as I suspect you’re aware, a lot of the donations and membership to open source foundations have been dropping off just due to, like, you know, global economic situations, which makes it even harder. Okay, so I want to get to the kind of what can we do part, but before we do…

Brian Fox (Sonatype) (18:22) Mm-hmm.

Absolutely.

Josh Bressers (18:44) I would value your thoughts and opinions on this. So I feel like we’re reaching, I think in your letter, you say like, we’re not yet at a crisis, but there is definitely a certain amount of, I guess, awareness and inner reflection in open source of like, everyone is starting to notice like, holy crap, like everyone is using us and it doesn’t feel like

they’re using us as a friend, they’re like sneaking into our house and stealing our stuff to a degree, right? Where it’s almost, I don’t know the word to call this, ’cause it’s not like, it’s open source, we made it, we made it free, we knew how this was working, but it’s definitely starting to feel like there is a scale tipping in the direction of the open source universe saying like, this is not fair. I would like.

Brian Fox (Sonatype) (19:17) Mm-hmm. Mm-hmm.

Josh Bressers (19:39) Am I off my rocker here? What do you think?

Brian Fox (Sonatype) (19:40) Yeah.

No, no, you’re right. you know, and it’s sort of ⁓

historically, in my own case, before Maven Central was on a CDN, it relied on donated bandwidth from iBiblio, University of North Carolina, right, back in the day, and it was kind of unreliable. So, conveniently or coincidentally, that’s how I got involved in Maven in the super early days. I wrote some plugins so I could just make sure all my dependencies were downloaded so I could go get some coffee when my build ran and not come back and find it failed halfway through.

and caching proxies were like a mandatory thing to do anything real back in the day.

As we made the infrastructure more robust, I think more organizations have moved away from that. So these are simple win, win, win situations. Like these caching proxies will make sure that your organization downloads the things you need once, only once. So it saves the registry time. It’s actually going to speed up all your builds because fetching them, you know, from either an on-prem instance in your build office or in the cloud next to your infrastructure makes all of it faster. You’re going to pay less CPU time for your builds, right?
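A real caching proxy such as a repository manager does far more than this, but the core "download once, only once" behavior can be sketched in a few lines of Python. This is a minimal illustration that assumes artifacts never change once published, as Brian describes for Maven. The function name, the cache layout, and the injected `download` callable are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, cache_dir: Path, download: Callable[[str], bytes]) -> bytes:
    """Return the bytes for `url`, calling `download` only on a cache miss.

    Safe only for immutable artifacts: a cached entry is never revalidated.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any URL maps to a safe, fixed-length file name.
    entry = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if entry.exists():
        return entry.read_bytes()
    data = download(url)
    # Write to a temporary file and rename it, so a crash cannot leave a partial entry.
    tmp = entry.with_suffix(".tmp")
    tmp.write_bytes(data)
    tmp.replace(entry)
    return data
```

Passing the downloader in as a parameter keeps the sketch independent of any particular HTTP library. In a shared setup the cache directory would sit on the proxy host rather than on each build machine.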

Everybody wins if you actually do it and most people don’t do it because they take the easy way to just code in that URL. so ⁓ there have been solutions to this problem all along and we’ve been worse at using them. That’s part of the challenge here. But I don’t think any, none of the package systems are ⁓ really at the point of saying like,

We’re at risk of being bankrupt right away. That’s not what we’re saying. But we’re looking at the thing and saying, right, we’re looking at the growth and saying, listen, this can’t continue. This is going faster and faster and the rate of growth is accelerating, you know. And especially the case I talked about with AI. You know, it used to be in some cases bounded by how fast you could put humans in seats, which had a natural limiting factor. But if all of a sudden these things can be, you know, spun up, I can spin up 10,000 agents in five

Josh Bressers (21:27) Well, not yet.

Brian Fox (Sonatype) (21:53) minutes that simulate 10,000 developers, that’s great, except if all of them are provisioning ⁓ OpenVSX and going to download Python and grabbing stuff from Central times 10,000, the actual growth now does not have any natural limits to it in that regard. it could accelerate this problem even faster. And the challenge that I observed in talking to various folks about it is there was a sensitivity

to the communities that if we start making changes here, if we start putting limits in place, we start asking commercial publishers to pay their fair share, start asking huge consumers to pay their fair share, finding other ways to produce income streams to help pay for this, that the open source community might really rebel against that. And so there was a hesitancy, like everybody understood that a solution was needed, but everybody was sort of afraid to talk about it publicly

due to backlash. Even when I started rolling out some of the initial rate limiting, I was a little bit worried about a freak-out. I didn’t get one, which was nice. But it’s always that fear. And so what we realized is like, listen, we need to be able to normalize these elements to help support this infrastructure. We need to be able to talk about these problems and have people go, right, it doesn’t make sense for Sonatype to pay the bandwidth for

cloud

scale integrations and all of their much bigger customers. Like that just doesn’t make any sense here. ⁓ And to do it together so that the ecosystems don’t pick each one apart and accuse people of being bad actors, as you know, as well as I do, that has happened in many cases, right? And so, you know, a lot of people, when they read the letter, they were sort of clamoring for like, we want you to tell us what’s next.

Josh Bressers (23:30) Yeah. Yeah.

Yeah.

Brian Fox (Sonatype) (23:53) what exactly to expect and our answer was, you know, listen, we’re not trying in this first step to prescribe what the answers are. We’re not trying to say that every ecosystem is going to solve this problem the same because we all have different history. We all have different, you know, at the low level, the way they operate, the things they can and can’t do just technologically or politically within their ecosystems is very different. The problem is the same. And so what we really wanted to be able to

do is go on record and explain what’s happening, what the background is, explain that it is applying to all of these ecosystems more or less equally. So then when the ecosystems start to implement some of these changes, we can kind of point back to this as the body of work to say like, listen, this is what’s happening and this is why. And by the way, we’re all doing it.

Josh Bressers (24:45) Yeah, yeah,

and you have a quote in this letter that says sustainability is not about closing access, it’s about keeping the doors open and investing for the future. Which is good because I think when conversations like this start, it’s often viewed as, you’re closing out the people you don’t like, right? And that’s not at all the intent. And I want to pick on one thing you said at the beginning of this particular ⁓ tangent, I’ll call it, is you mention about everyone should run their own.

Brian Fox (Sonatype) (24:56) Hmm

Yeah, exactly. That’s right. that’s yeah.

Josh Bressers (25:14) kind of internal mirrors and things like that. And I can tell you, Brian, that’s really hard to do. I have done this for many projects because I do a lot of weird stuff where I need like lots of open source downloaded for whatever like mad scientist thing I’m doing. It sucks, man. It sucks hard to make that work. And I think that is probably going to be part of this discussion at some point. Now granted, if I am a giant web scale company, yes, I can stand up a mirror, no problem.

Brian Fox (Sonatype) (25:17) Mm-hmm.

Josh Bressers (25:44) but Josh Bressers with his computer in his basement, that’s hard to do in a way that isn’t horrible.

Brian Fox (Sonatype) (25:44) Mm-hmm.

Ahem.

Yeah, and Josh Bressers with his computer in his basement, I hope, is not really the problem here, right? Okay, well, maybe you are, but in aggregate, it’s probably still relatively small, right? Right. And this is the thing, like depending on the tool, it should be easy. If you’re using Maven, it’s not that hard.

Josh Bressers (25:56) I have been rate limited by nearly every package repository on the planet at some point.

For most people, yes, I understand.

Okay.

Brian Fox (Sonatype) (26:15) if

you’re using other things. But what I uncovered, and I documented this in my blogs, is that some of the other non-Maven build tools, even in my own ecosystem, make it really hard. And I only uncovered this because some large organizations were getting throttled and I started working with them to understand and they came back after a while and said, we can’t fix this. And I’m like, what do you mean, you can’t fix it? And they started explaining like, yeah, we have all of these Gradle builds, for example.

In order to implement a change, like they already had a repository manager. 100% of the organizations I’ve spoken to that were in the top consumption already had Nexus or Artifactory. Like I didn’t expect that to be the case and it was. And then it’s like, okay, why are you not effectively using it? And you unpack it and it’s like, because there’s no global switch, for example, in Gradle to just flip that and make it easy like we have in Maven, right? We found other cases where like,

Josh Bressers (27:00) Wow.

Sure. Okay.

Brian Fox (Sonatype) (27:15) in

NPM, there was the React Native stuff, which was itself generating Gradle. So it was generating Gradle scripts, and there was no place further for them to be able to do it. So in one instance, the organization was telling me, I’d have to go edit like 1,000 or 10,000 individual files to make this work.

In another one, they were saying, I literally can’t do it because the files themselves get generated at runtime. So I run an NPM build. It spawns off like a thousand Gradle builds, each one of those Gradle builds downloading a bunch of the same things over and over again, you know, and it’s, it’s things like that. So I believe you that it’s hard. And part of my point here is the tools in the environment should make it easy, right? Just like the open source project that I was talking about a few minutes ago, there was literally no way to

change that baked-in thing, but they now have a pull request open to actually fix that, which is great. Most of the time, this stuff is not intentional, which is why I use the word abuse carefully here. You know, it’s certainly at least unintentional abuse. And it’s partly just because the economics never forced people to think about it. And I’ve used sort of, you know, the cap and trade carbon concept about this, you know, like the more carbon you emit, society at large tends to pay the price

Josh Bressers (28:08) Thanks.

Right, right.

Brian Fox (Sonatype) (28:33) for that but the individual you know producer of that ⁓

doesn’t usually factor that in and that was part of the logic behind the cap and trade to put a little bit of a price on this to make people think about maybe I should do things a little bit more sustainably. And the same is true for downloads. If there was a rate limit or an implied cost to some of these things, people might think, maybe I’m not gonna download this thing all the time. ⁓ And it goes on the publishers, by the way. I wrote about this in the blog that I found

one publisher who was mocking a release every commit because one time in the past he had some issue with his release. So now he would build and publish to our infrastructure where we would do all of the validations on every single commit and then he would just drop it. And I emailed them and was like, dude, what are you doing? Like, and that’s one of the things that started this. And I literally said to him, like, listen, if you don’t care about wasting my money, at least be a better carbon burner for the rest of us. Cause you’re spending

Josh Bressers (29:18) ⁓

Brian Fox (Sonatype) (29:38) these machines for literally nothing, you know. And I was talking to another large organization publishing an SDK the other day and they said like we publish every day like 400 jars. I said, wow, okay, those all change every day? They said no, one or two of them, but we like to keep the version number the same as a convenience. And I kind of scratched my head and said, like if there was an implied cost for you to do it at this scale, you would at least reflect on how much do you care

about that convenience. Are you willing to pay for this? Because it just doesn’t make sense, you know? And in Maven Central, nothing ever comes down. So all of those things that they’re just republishing the same jar and only changing the file name, we carry that baggage forever, right?

Josh Bressers (30:11) Yes, right.

Brian Fox (Sonatype) (30:25) So it’s these kind of subtle things that have to change. And the only way that humans really start thinking about this is when you start applying a cost to using. You can use so much, but you can’t use or produce beyond reasonable levels. And I think that’s generally where this is headed.

Josh Bressers (30:35) Yes.

Yes, okay, so

let’s talk about that. Let’s talk about like, what do we do? Because I think this letter lays out the problem very well. I think you’ve made a very compelling case up to this point. But I also feel like a lot of these open source discussions about some of the challenges we have in the open source universe is, okay, great, now what? And it feels like, you know, that South Park underpants gnomes episode where we’re in the question mark stage and profit is next, but.

Brian Fox (Sonatype) (31:03) Mm-hmm.

Right. Right.

Right.

Yeah, I mean, I think some of the things certainly I’ve explored, I’ve explored partnerships with others in the ecosystem that are interested in making users aware of things like they support end of life components and working with partnerships where they pay us to help get access to that. Get that visibility is a thing I’ve explored. We’ve explored

you know, offering to organizations who can’t get to a reasonable consumption to basically pay for higher tiers. ⁓ You know, we’ve explored ⁓ publishers, some of these higher frequency publishers may be paying. We haven’t moved in that direction yet, you know, but I don’t think it’s unreasonable to ask a commercial organization producing hundreds of jars every single day because they can to help pay for some of this infrastructure, right? ⁓

And so we’re millions of dollars away from break-even, right? So this isn’t about profit seeking, it’s about trying to at least balance this out. And if we do it right, it will mean that Josh in his basement can continue to do things at a reasonable level,

mostly for free. If you want to make your stuff go faster, then maybe you can pay a little bit more. I mean, that’s true in the cloud: you can always buy a bigger, faster machine if you want your answer faster, right? So it doesn’t feel completely insane for you to say, like, I could do it, I’m going to have to do it below the limits, but if I want it now, then I chip in some money. You know, those types of things, I think, start to become reasonable.

Josh Bressers (32:53) Yes, I agree with everything you said, but the interesting part is going to be, but open source is free, Brian. Right? I feel like we have this ingrained in our heads, but-

Brian Fox (Sonatype) (33:05) Yeah, that, that, that… It is

free. Listen, I’m making all of this open source stuff available for you to get. All I’m saying is like, I don’t think it’s fair for you to come in and take it 500,000 times a month. Right? Um…

Josh Bressers (33:19) Yes, that’s fair. I mean, yeah, yeah.

Brian Fox (Sonatype) (33:23) You know, and worse, you know, somebody used this analogy and I think it’s useful. It’s like, listen, you know, if you were running a soup kitchen, you know, and some people are coming in and getting soup, you’re happy to serve them. And sometimes you might wonder, like, does that person in the suit really need to get free food? Like, I’m going to give them free food, but I think everybody would say what’s not okay is for somebody to come in and take the whole pot of soup and then go sit outside and start selling it. Right.

Josh Bressers (33:50) Yes, that

Brian Fox (Sonatype) (33:51) And that’s

Josh Bressers (33:51) is yes.

Brian Fox (Sonatype) (33:52) close to what we’re talking about here. We’re saying, listen, we’re happy to serve this. We’re happy to provide these as services for the world and to do it for free, to help open source, to help all of these things. We are part of the ecosystem, right? But we just don’t think it’s right in that context for somebody to say, like, I don’t want to run a caching proxy, so I’m just going to download this 500,000 times this month. I don’t think anybody would stand up and say, like, no, that’s against the open source ethos for you to

Josh Bressers (34:15) Yeah, yeah.

Brian Fox (Sonatype) (34:21) say don’t do that. I mean, I certainly would hope that’s the case. Well, you know, I don’t know.

Josh Bressers (34:24) They’re gonna say it and you know it, right? Like, ⁓ poor little me.

It’s not fair, but…

Brian Fox (Sonatype) (34:34) I have been surprised. I think most people have looked at what we’ve said and said, yes, this is logical. We haven’t seen any crazy conspiracy theories pop up yet, which is good. And I think that tells me we’re on the right path. We’re thinking about it the right way. We are very cautious about making sure that we’re not violating open source norms here. We’re just trying to highlight a potential problem. It’s great that open source is so important to all

of these infrastructure pieces to all of these things. It’s helping society get better things faster, all those things. That’s all great. But we can’t continue to just mine these things indefinitely. It’s like fishing a species to extinction. Let’s slow down. Let’s think about how we can provide mechanisms to make this sustainable. And then we can build on top of that.

Josh Bressers (35:20) Yeah. Yeah.

Yeah, and actually there’s a, I’ll put a link in the

show notes, but the Atlantic Council wrote a really nice paper about open source sustainability. And they compare the ecosystem to the environment, and water is their example, right? You can do things to use water in a way that it will be available theoretically forever, or you can do it in a way that in, like, five years no one can drink the water and it’s completely ruined. And right, we haven’t reached that point yet, but let’s make sure we keep it

Brian Fox (Sonatype) (35:42) Yes.

That’s exactly right. Yep. Yep.

Josh Bressers (35:58) you know, healthy and good. So, okay, I want to hold your feet a little more to the fire here, Brian, because you’ve talked about some, I think, interesting ideas, but, like, what are the actual next steps? If I’m someone who actually cares about this, what can we do? Like, what are some actual real things we can start doing to help push this forward and make a difference?

Brian Fox (Sonatype) (36:19) Well, I mean, the very practical thing is take a close look at how your systems are interacting with these public registries. Can you reduce your footprint? You know, I’ve had some people reach out to me and ask, and I’ve pulled some statistics for people to make it easy. I understand that it can be hard for organizations at massive scale to really get their arms around this. But, you know, we’re willing to help provide that visibility if you’re trying to solve the problem, you know. So that’s the most practical thing: reduce your footprint.

If you’re publishing these things on every commit just as a test, maybe stop. If you’re publishing things over and over again every day, maybe stop. Those are things you can do today. If you’re building a tool in the ecosystem and you’re just downloading the same metadata from these registries over and over again, same thing. Don’t do that. Cache it. If you’re a business, build a cache,

make yourself not dependent upon these things. Like, that’s just good business practice. Why are you operationally dependent upon open source infrastructure paid for by somebody else? That’s not a good business plan in general. So these are the kinds of things. And again, these things have generally been unintentional. And that’s part of trying to raise the awareness here. And putting limits in place, charging for higher tiers,

Josh Bressers (37:35) Yes. Yes.

Brian Fox (Sonatype) (37:44) Those are things that people can choose to do, or they can choose to reduce their footprint. Either way, we help solve the problem.
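
The "don’t download the same metadata over and over, cache it" advice can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommendation of a specific tool: the `TTLCache` class, the one-hour lifetime, the example URL, and the `fake_registry_fetch` stand-in for a real HTTP request are all hypothetical. A production setup would more likely be a repository manager or caching proxy in front of the public registry.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache: re-fetch a URL only after its entry expires."""

    def __init__(self,
                 fetch: Callable[[str], bytes],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch            # function that actually hits the registry
        self._ttl = ttl_seconds        # how long a cached response stays fresh
        self._clock = clock            # injectable clock, handy for testing
        self._entries: Dict[str, Tuple[float, bytes]] = {}
        self.upstream_requests = 0     # how many times we really went upstream

    def get(self, url: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # fresh enough: serve from the cache
        self.upstream_requests += 1
        body = self._fetch(url)
        self._entries[url] = (now, body)
        return body


def fake_registry_fetch(url: str) -> bytes:
    """Hypothetical stand-in for an HTTP GET against a public registry."""
    return f"metadata for {url}".encode()


cache = TTLCache(fake_registry_fetch, ttl_seconds=3600)
for _ in range(500):
    cache.get("https://registry.example/some-package/metadata.xml")
print(cache.upstream_requests)  # 1
```

Five hundred lookups from the tool turn into a single request to the registry within the cache lifetime, which is the footprint reduction being asked for.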

Josh Bressers (37:51) Yes, yes.

Well, I mean, but that’s very different in - I mean, I’ll use Docker Hub as my example. I’ll pick on them because, you know, Docker Hub used to be just like Wild West. And then they put usage limits on their data. In fact, I have, in the past, I have paid for a subscription just because like I need to pull more stuff than I can pull. So I guess I’m giving you money now and that solves my problem, right? Or I was actually like several hours into trying to set up a proper, you know, Docker cache and I’m like,

Brian Fox (Sonatype) (38:11) Yeah, yeah. Yeah, yeah.

Josh Bressers (38:20) This is too much work. Here’s $60. We’re done here. And it was good for them and good for me in that regard.

Brian Fox (Sonatype) (38:26) That’s right. And I think that’s a reasonable outcome. Right. You know, and that’s the case. And it’s like, I think you would justifiably feel different if you weren’t able to get any of those things, if you had to pay just to get access to bit one. That’s, I think, over the line. And nobody here is talking about that. Right. But your convenience, your time is valuable, you know, and you can pay a little bit more to get it faster. That’s what I was saying before. And that feels like a

Josh Bressers (38:47) Right, right.

Brian Fox (Sonatype) (38:56) trade exchange of value and that’s all that we’re asking for here.

Josh Bressers (39:00) Yeah, yeah, yeah. And

I mean, that’s a very new idea in this universe. Like, it will be very interesting to see how this unfolds as we go forward.

Brian Fox (Sonatype) (39:10) Yes, it is so new and that is why everybody felt like nobody wanted to speak up until they were all speaking up together.

Josh Bressers (39:18) Yeah, which is

great, which is really cool. And that’s now that’s one thing you did mention to me is you kind of brought this up with some of the ecosystems and they were all like, yeah, I have that problem too. But no one was saying it in public, which is it’s like everyone was suffering in silence, right?

Brian Fox (Sonatype) (39:29) Mm-hmm. Mm-hmm.

More or less. Yeah. I mean, you know, like I said, I’ve been chronicling my journey as I’ve kind of tried to rationalize this. But once I understood the scale and started talking to others and realized, like, this is a problem for everybody, and maybe even in some cases more acute. You know, the not-for-profits, they don’t have big banks to lean back on. You know, if our bandwidth bill goes crazy, my CFO is going to call and, you know, ask me what’s going on. But if I’m an open source project and I lose a couple

of members and my bandwidth bill quadrupled in a year, what do I do? There’s no place to fall back on. So that’s when I realized, wait a minute, yeah. Go ahead.

Josh Bressers (40:14) Right. Well, I haven’t, well, I know.

Right, and I know

in that sort of situation, there’s a lot of donated bandwidth, right? And that’s also not sustainable for the donators, right? Because I’m sure they’re feeling the pressure as well, so.

Brian Fox (Sonatype) (40:26) Mm-hmm. Mm-hmm.

Yes. Yes.

That is,

I’m glad you brought that up. I mean, because Sonatype’s not a not-for-profit, we weren’t able to get that bandwidth for free. We tried. We recently moved to Cloudflare, and they’re really great in working with us on this. They understand the importance of it. But everybody else is getting it basically for free. And I cautioned them. I said, listen, don’t make the same mistake. At some point, that’s going to get significant enough that it’s going to be

Josh Bressers (40:40) Right.

Brian Fox (Sonatype) (41:02) problematic. You should start thinking about how to keep this growth under control now and envision a future where you don’t have to be dependent upon one provider to give it to you for free. You know, you’re going to make the monster bigger and it will be harder later. So use the opportunity now to get these things under control.

Josh Bressers (41:14) Yeah. Yeah.

Yep, for sure. Well, and that happened to, it was one of the, I think the C#, like NuGet or one of those repos. I can’t remember the name of it now. One of the, what, the .NET package managers that, yeah, that might’ve been it. Yeah, they lost their CDN and it was just chaos for a long time. Yeah, yeah. All right, Brian, this has been a fire hose.

Brian Fox (Sonatype) (41:31) CocoaPods? Yeah.

Yeah, was it CocoaPods maybe? Yeah.

Mm-hmm. Mm-hmm. Yeah.

Josh Bressers (41:49) But I think this is a good topic. I think also, like, one of the challenges now is to find ways to get this word out. And so this is, like, my plea to everyone listening: read the letter and tell all your friends. Because while sometimes to us security nerds it’ll be like, everyone should know this, it was published on the OpenSSF blog, I have a suspicion that with the size of tech, this is a very real problem that is going to take time to just

Brian Fox (Sonatype) (42:11) Right.

Josh Bressers (42:18) make sure it gets to the places it needs to go. Yeah, and it’s a big deal.

Brian Fox (Sonatype) (42:21) Mm-hmm. Yep.

Yeah.

Josh Bressers (42:25) Alright man.

Brian Fox (Sonatype) (42:26) Like

I said, I do think that over the long term, when we sort this out, it can be an opportunity. It can help break the cycle that we’ve all been in around paying the maintainers. Nobody wants to pay the maintainers. You know, if the infrastructure starts to take care of itself and, you know, maybe starts producing a little bit extra, that extra can go back into, you know, the actual maintainers who are driving all of this. Maybe this is the final thing instead of just relying on goodwill donations.

You know, I think so. I just don’t know how it continues otherwise.

Josh Bressers (43:01) I don’t think it’s

a final thing, Brian. I think it might be the beginning of the first. So, it’s been a treat, man. Thank you so much. And I cannot wait to talk to you in a little while to see how this is all going. So it should be fun.

Brian Fox (Sonatype) (43:06) Maybe, yeah.

Yeah, thanks for having me.