Thoughts on Conway's Law and the Software Stack (2019)(blog.jessfraz.com)
The past should also communicate better with the present.
> ...you’d be crazy to think hardware was ever intended to be used for isolating multiple users safely.
But of course it was. Protection rings were invented for Multics, the first operating system specifically intended to isolate multiple users safely. And side-channel timing attacks were called out as a concern at the time.
We may have lost our way somewhere, but let’s not rewrite history.
IIRC some side-channel attacks we would now recognise as Spectre were worked on in the 90s, but they only really showed themselves properly in the 2010s.
fwiw: she knows https://blog.jessfraz.com/post/why-open-source-firmware-is-i...
you're thinking at a different level than she is
She’s not saying this, she reports that hardware engineers are saying this. Which, if generally true, is sad.
"There are people thinking an interface or feature is secure when it is merely a window dressing that can be bypassed with just a bit more knowledge about the stack." -- Jessie Frazelle shares one of the toughest problems when dealing with complexity. It's hard to understand where it starts and where it ends, and we create holes all around us due to our inherent limitations: it's too hard for humans to reason about so many moving pieces. What should we do about it? I don't know. Maybe make it more visible so that others can make better (not perfect) decisions. Perhaps acknowledge our limitations and practice more concepts from Resilience/Chaos Engineering to see where our intuition breaks.
Chaos engineering is brilliant for improving the resilience, safety and availability of a system, but I'm not sure it would help improve the security of a system to the degree that the author describes, i.e. to prevent the SoftLayer firmware hack.
We need more knowledge, more education, more auditing, more communication, and this across all levels of the stack. If we use abstraction, perhaps we need to recognize that abstraction does not absolve us of the responsibility for auditing the implementation, and that abstraction has not only value but also cost.
To make this easier across so many dimensions, we probably want to simplify our stacks and minimize our dependencies and supply chains far more drastically than we do.
There's also a cost to all this understanding, and it comes down to what the business values: security or velocity. But perhaps you can get both, especially if you design for simpler systems.
"Another example would be the hack of SoftLayer. Hackers modified the firmware on the BMC from a bare metal host the cloud provider was offering. This shows another mistake in having blinders on and not being conscious of the other layers of the stack and the entire system."
Does anyone know whether dedicated hosting companies such as Hetzner or xneelo reflash the firmware on server boards (and disks and everything with firmware in it) before they rent it out again?
And whether they use more forceful hardware techniques to do the reflashing, and not software to politely ask the firmware to reflash itself?
I would hope they would but I can't find any security policy showing they do, or how.
> There is not sufficient communication between the various layers of software.
I disagree with this. In fact, my conclusion is the opposite. It's a good pattern to limit the complexity of communication between different layers and have strong encapsulation of state. The simpler an interface is, the easier it is to integrate with other components and the less likely it is to run into integration issues and expose vulnerabilities.
The simpler the interface (with fewer, simpler endpoints), the fewer e2e scenarios you need to account for. This allows you to build more stable, more secure software which is easier to test. Modules which expose a large number of endpoints tend to encourage micromanagement of that module's internal state, and this leads to issues.
You need to have clear separation of concerns between layers and components in order to be able to come up with the simplest interface possible which allows different components to interact.
Agreed, but only if & when the abstractions are not "leaky". Robust encapsulation is simply not possible when abstractions are leaky or ill-fitting for the domain -- including an incomplete model of the domain. The latter point -- needing a complete & consistent model of the domain -- is IMHO commonly under-appreciated; a lot of software is based on poorly understood partial abstractions for their respective domains.
Bits of information leaking through faulty abstractions are exactly what make conduits for "side-channel" attacks.
Leaky abstractions mean you have the wrong mental model.
Abstractions (attempt to) hide complexity. Worst cases are black boxes.
Mental models attempt to clarify how something works, to make it easier to reason about.
Totally. A robust domain model is one of the main things that matters in terms of software architecture in my experience.
Yes, true. You also want the data which is passed between the different layers to be as simple as possible to avoid leakiness. I try to avoid accepting complex instances as function arguments, and I try to avoid returning complex instances that would allow the caller to manipulate shared state.
It takes a lot of effort and thinking to figure out an abstraction which fulfils all of its requirements while at the same time exposing a simple communication interface.
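A minimal sketch of that idea in Python, under assumptions: the names `SensorStore` and `Reading` are hypothetical and not from the article or any library. The store accepts only plain values and hands back immutable copies, so a caller cannot reach in and change its internal state.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reading:
    """Plain, immutable value that crosses the layer boundary (hypothetical)."""
    sensor: str
    value: float


class SensorStore:
    """Hypothetical module that owns its state and exposes a small interface."""

    def __init__(self) -> None:
        # Internal mutable state; never returned to callers directly.
        self._latest = {}

    def record(self, sensor: str, value: float) -> None:
        # Accept simple values only, not a complex object the caller keeps a handle on.
        self._latest[sensor] = float(value)

    def snapshot(self) -> Tuple[Reading, ...]:
        # Return an immutable copy; later writes do not change earlier snapshots.
        return tuple(Reading(s, v) for s, v in sorted(self._latest.items()))


store = SensorStore()
store.record("cpu", 41)
before = store.snapshot()
store.record("cpu", 50)
after = store.snapshot()
```

Here `before` still holds the old value after the second `record` call, and attempting to assign to a field of a `Reading` raises an error, which is the point: the only way to change the store is through `record`.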
Modularity is just a strategy to mitigate bad design, not the cause.
Modularity enables divide & conquer, specialization, etc. Conway's Law is a good thing.
Architecture is the set of visible design choices, defined through interfaces.
It'd be amazing if we divined some new strategies. Like maybe GA or the newer stuff.
Use NixOS, and at least you are standing atop the tower of Babel and able to see all the parties that don't communicate.
Maybe if everyone does this, the parties of each conversation that isn't happening will be aware of each other. That's a first step.
> Use NixOS, and at least you are standing atop the tower of Babel and able to see all the parties that don't communicate.
I haven't used NixOS but this sounds intriguing. Could you explain in a little more detail how it helps with this?
1. It's a source (and binary) distro, so you can modify anything and there's no hiding how the sausage is made.
2. Because all our builds are so so sandboxed, there's this uncanny feeling that we sometimes know the dependencies / prereqs of a package better than the upstream developers!
> There is not sufficient communication between the various layers of software.
Because there is not enough communication between the engineers writing this code lol. I.e., Conway's law ;-)
There's probably some cutting joke one could make here, about our industry's increasing reliance upon incomprehensible artificial intelligence systems, and how that might reflect upon the management. But decency dictates that I shouldn't utter it.
Couple of good points, and a set of good questions. Wish I had answers to match!
Communication problems usually come down to people not understanding each other, which has many causes of course. Most of these have to do with people all being different. When people are all the same, there are fewer communication problems.
But having people all the same then introduces non-communication problems. Lack of genetic diversity is a potential catastrophe. Lack of outside perspective makes novel thinking difficult. Lack of change breeds complacency.
So you need your software to work pretty much all the same way, but you need diversity in how it works to survive environmental challenges.
Nature has a solution for this (as in most things): evolution. You mutate a bit every so often and eventually something works better. Then you stick with that and repeat. It's a continuous improvement cycle driven by measurably better outcomes from experiments.
This can work for hardware too but it's definitely not as straightforward. But no matter what you use it on, you can't have some clunky, hierarchical, reactive, conservative, single-minded power structure holding the process back. The stack will always suffer as long as there's a giant anchor at the top keeping it from being agile and mutating. It's not so much a cultural thing as an organism; if the organism can't agree on how to grow, it's not gonna fare well against competition.
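The mutate, measure, keep-if-better cycle described above can be sketched as a tiny loop. This is only an illustration under assumptions: the function name `evolve`, its parameters, and the toy fitness function are made up for the example, and real systems have far noisier measurements.

```python
import random


def evolve(fitness, start, steps=200, scale=0.5, seed=0):
    """Repeatedly mutate the current best and keep the mutation only if it scores higher."""
    rng = random.Random(seed)
    best, best_score = start, fitness(start)
    for _ in range(steps):
        # Mutate a bit: small random change to the current best.
        candidate = best + rng.gauss(0.0, scale)
        score = fitness(candidate)
        # Stick with the change only when the measured outcome is better.
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def toy_fitness(x):
    # Hypothetical objective with its peak at x = 3.
    return -(x - 3.0) ** 2


best, score = evolve(toy_fitness, start=0.0)
```

Because a candidate is only accepted when it scores higher, the result is never worse than the starting point; how quickly it improves depends on the mutation size and on how long each experiment takes to evaluate.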
We’ve replicated evolution in software with free software and agile. The problem is that the feedback loops are too long to prevent damage. The systems we have make some people the slow gazelle: they get eaten and don’t reproduce.
We need different people in software, and more of them, and we need to communicate better. The real problem is the way we isolate ourselves. I think the solution is less code and more understanding of what the limitations of existing code are. This requires us to prioritize communication and understanding.
2019 is not that long ago. Do you have any examples where the issues or concerns raised in the post have been satisfactorily resolved?
It’s a note to the mods to adjust the title. Anything that doesn’t have a date is supposed to be News.