I had a discussion with Thomas Depierre about his experience with safety and how safety concepts can apply to the field of security.
Thomas is an experienced SRE with a background in safety, and he has thoughts on how people prevent disasters constantly, often without realizing it. You can find his blog at Software Maxims.
An audio version of this discussion is also available in podcast format. Look for “Open Source Security” wherever you get your podcasts.
The security industry likes to reinvent wheels. When I created the idea of Open Source Security I knew Thomas was one of the first people I wanted to talk to, specifically about safety, and how security has a lot to learn from the safety folks.
In the world of safety, it’s often easy to understand the spectrum of harm. There can be a wide range, from a machine breaking to loss of human life. In security we don’t have such obvious negative outcomes from our incidents. Well, we’re starting to: a hospital with ransomware is without question harming human life no matter what anyone claims. But usually when we think of problems in the world of security it feels more like an inconvenience for a little while, then everyone goes back to pretending everything is fine a few days later.
Perfect or nothing
When I started this chat with Thomas I assumed that the safety universe has a more reasonable view of the world: nothing can be completely safe. While I think this is starting to change in security, there are still a lot of goals that revolve around being 100% secure instead of accepting some level of security that’s not perfect. The thing I didn’t think about is that there are some spaces where we need perfect safety. A nuclear reactor is an example where planning for failure isn’t reasonable. We have to operate a nuclear reactor in a way where perfect is the only option.
This is a space that will have to evolve with security in the future. While perfect security isn’t practical, there will be spaces where we can’t afford to let security fail. Something like the information security of a hospital harming human life will never be acceptable. It’s sort of a weird problem to discuss: when is it OK to “accept the risk”?
Greater than the sum of the parts
The other thing Thomas hit on that I hadn’t put a lot of thought into is how you can take a bunch of unreliable components and turn them into something more reliable. The easiest example here is a hard drive RAID array. You can use a bunch of somewhat reliable drives to make one very reliable array.
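To put rough numbers on the RAID idea, here is a minimal sketch in Python. It assumes a simple mirror where data is lost only if every drive fails, and it assumes drive failures are independent, which real drives from the same batch in the same chassis often are not. The 5% failure probability and the function name are made up for illustration; they don't come from Thomas or from any real drive data.

```python
def mirror_loss_probability(p_drive: float, n_drives: int) -> float:
    """Chance that every drive in a mirror fails in the same period.

    Assumes independent failures and ignores rebuild windows,
    controller faults, and correlated failures (all hypothetical
    simplifications for this sketch).
    """
    if not 0.0 <= p_drive <= 1.0:
        raise ValueError("p_drive must be between 0 and 1")
    if n_drives < 1:
        raise ValueError("n_drives must be at least 1")
    # Data is lost only when all n drives fail, so multiply the odds.
    return p_drive ** n_drives


if __name__ == "__main__":
    # 0.05 is an illustrative number, not a measured failure rate.
    for n in (1, 2, 3):
        print(f"{n} drive(s): {mirror_loss_probability(0.05, n):.6f}")
```

Under those assumptions each extra mirrored drive multiplies the chance of losing everything by the single-drive failure probability, which is why a pile of somewhat reliable parts can add up to something very reliable.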
Building something reliable out of unreliable parts feels like a description of modern IT, with all the services, microservices, containers, workers, and just everything. Sometimes it feels like nothing should work, yet the world seems to function. All those stories about how nothing is secure and everything is terrible appear to have been mostly untrue.
The logical extension of this idea is that multiple insecure systems could add up to something more secure, but that probably needs more research before anyone would be willing to make the claim. If you know of some evidence that suggests this, do let me know.
Humans are the … solution?
The final takeaway from Thomas that I still struggle to wrap my head around is that humans are the reason everything works. We love to blame people in the world of security. Just a few years ago the phrase “you can’t patch people” was thrown around all the time. The idea was that people are the problem and only security can save us from ourselves.
Thank goodness that attitude has vanished … right?
In the world of SRE, and DevOps, and probably every other industry, there are people who do little things constantly to keep everything running. Restarting crashed services, renewing a domain or certificate, fixing a bug before the memory leak gets out of hand. We all do little things like this all day every day.
My example for this is when I go on a work trip. It seems like all the technology in the house fails when I’m gone, but that’s because it does. I do little things every day to keep the TV working, and the internet up, and the computers updated. When I’m gone, nobody does any of this, so things start to slowly fail. I don’t even know what I do, but it clearly does something.
Why does this matter? Because people aren’t the problem. If people stopped working we wouldn’t be more secure. Nothing would work, and all our infrastructure would collapse. Security needs to stop blaming people as a source of problems. All those links they click and documents they open are how they get work done.
It was a pretty fun chat. Thomas has some great points of view and I always learn a lot when I speak with him. I look forward to his next visit.