Schneier on Security
A blog covering security and security technology.
October 10, 2012
The Insecurity of Networks
Not computer networks, networks in general:
Findings so far suggest that networks of networks pose risks of catastrophic danger that can exceed the risks in isolated systems. A seemingly benign disruption can generate rippling negative effects. Those effects can cost millions of dollars, or even billions, when stock markets crash, half of India loses power or an Icelandic volcano spews ash into the sky, shutting down air travel and overwhelming hotels and rental car companies. In other cases, failure within a network of networks can mean the difference between a minor disease outbreak or a pandemic, a foiled terrorist attack or one that kills thousands of people.
Understanding these life-and-death scenarios means abandoning some well-established ideas developed from single-network studies. Scientists now know that networks of networks don’t always behave the way single networks do. In the wake of this insight, a revolution is under way. Researchers from various fields are rushing to figure out how networks link up and to identify the consequences of those connections.
Efforts by Havlin and colleagues have yielded other tips for designing better systems. Selectively choosing which nodes in one network to keep independent from the second network can prevent “poof” moments. Looking back to the blackout in Italy, the researchers found that they could defend the system by decoupling just four communications servers. “Here, we have some hope to make a system more robust,” Havlin says.
This promise is what piques the interest of governments and other agencies with money to fund deeper explorations of network-of-networks problems. It’s probably what attracted the attention of the Defense Threat Reduction Agency in the first place. Others outside the United States are also onboard. The European Union is spending millions of euros on Multiplex, putting together an all-star network science team to create a solid theoretical foundation for interacting networks. And an Italian-funded project, called Crisis Lab, will receive 9 million euros over three years to evaluate risk in real-world crises, with a focus on interdependencies among power grids, telecommunications systems and other critical infrastructures.
Eventually, Dueñas-Osorio envisions that a set of guidelines will emerge not just for how to simulate and study networks of networks, but also for how to best link networks up to begin with. The United States, along with other countries, have rules for designing independent systems, he notes. There are minimum requirements for constructing buildings and bridges. But no one says how networks of networks should come together.
It's a pretty good primer of current research into the risks involved in networked systems, both natural and artificial.
Posted on October 10, 2012 at 8:18 AM
Nice article, but isn't this just a "logical/inevitable" extension of systems thinking, or dynamic systems? In an even broader sense, this is the same thinking that people are applying to climate studies. Obviously, we are somewhat limited by our resources, but to a greater extent we seem to be constantly limited by "our" imagination. I am sure there have been people screaming warnings about this for decades, but because of institutionalized confirmation bias, it takes something like this to break through the fog.
My science class was teaching this to middle school students over 20 years ago. As a nation we are having a recession of imagination and ideas. Just look at our pathetic space program. There is a great example of groupthink, creating a confirmation bias that demeans the net benefits of funding big, bold, daring projects. Just ask Neil deGrasse Tyson about the benefits; that is very Multiplex. I hope that this will give the creative & imaginative people a bigger voice, and a chance to solve our problems (or better yet prevent them), but this is going to just turn into a giant lobbying money sink.
Yes, densely connected systems (whether one network or many overlapping networks) are susceptible to the spread of contagions. Of course contagions, like ideas and innovations, can be neutral or positive.
In a densely connected system, it almost does not matter where the contagion starts, soon the whole system is infected. See: Contagion amongst Banks
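The point about dense coupling can be illustrated with a toy model. This is only a sketch under stated assumptions: the two six-node graphs (`ring` and `dense`) and the function `infected_after` are hypothetical, and contagion is assumed to spread to every neighbour at each step, which real systems rarely do.

```python
def infected_after(graph, start, steps):
    """Return the set of nodes reached within `steps` hops of `start`.

    Assumption: every infected node infects all of its neighbours each step.
    """
    infected = {start}
    frontier = [start]
    for _ in range(steps):
        next_frontier = []
        for node in frontier:
            for neighbour in graph[node]:
                if neighbour not in infected:
                    infected.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return infected


# Hypothetical sparse system: each node is linked only to its two neighbours.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

# Hypothetical dense system: every node is linked to every other node.
dense = {i: [j for j in range(6) if j != i] for i in range(6)}

for start in range(6):
    sparse_reach = len(infected_after(ring, start, 1))
    dense_reach = len(infected_after(dense, start, 1))
    print(start, sparse_reach, dense_reach)
```

In the dense graph a single step reaches all six nodes from any starting point, while the ring needs three steps to do the same, which gives operators time to react.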
The obvious solution to unstable, highly coupled networks is to introduce incompetence and inefficiency to decouple the parts and add inertia.
I wonder why this hasn't occurred to any of the multinational bureaucracies involved in this study?
As marco says, this is a fairly standard principle: excessive coupling of systems causes instability. It applies when implementing even a single application, so why is it a surprise when it becomes an issue with super-massive networks-of-networks?
While it's intuitive that the risk can be managed by outright breaking links, wouldn't it be a better goal to build modularity into the systems being linked rather than trying to cope with the promiscuity of existing systems?
For example, in the "Cascades of Failures" section, you have, "From there it crashes communication networks, which in turn take out power stations.", why the HELL would a failing communication network take down a power station, even one, much less all of them?
As critical infrastructure, power stations should either have a safe fallback operational mode when isolated from the network, or a dedicated network that isn't susceptible to this sort of threat.
This reminds me quite a bit of 'Normal Accidents' by Charles Perrow.
The funny side of this is that there has been quite a bit of related work carried out in this area, but it's more or less still classified some thirty years down the road.
Think about the network as a set of interrelated hardware modules from which you are trying to eliminate information leakage through side channels.
The way you deal with it is relatively simple but tends to be "inefficient" for a number of reasons.
You first identify "chains", "feedback loops", "feed-forward loops" and "storage elements", and decide whether you want the system to be synchronous (easy but very inefficient), asynchronous (hard but generally efficient) or a mixture.
You then go on to develop state charts and try to eliminate all loops and storage elements and build the chains into balanced trees etc. You also try to reduce to a minimum the number of states between any elements in the system. Often the simplest way is to design the system to "fail hard" on any error or ambiguity and back off and restart slowly.
Whilst it works to produce secure systems, it is generally quite inefficient and costly for any given level of performance.
The discussion of modularity in networks reminded me of the debate regarding the roles of the Federal, State, and local governments. The ongoing movement to centralize everything into the Federal "network" would seem to indicate risk for contagion (financial, security, etc). Decentralization into the smaller networks of State and local government would seem to add robustness to the system and failure of one part would not endanger the other parts. Not to make this a political discussion, but there do seem to be parallels there.
Railways - or at least the reputable ones - have known about this for a long time. Consider this 1960s training film, which covers a wide variety of delay causes but specifically mentions having one train wait for another.
Japan's railways are specifically engineered to avoid delay contagion. Not only is the Shinkansen network physically distinct from the much slower conventional trains - to the point of being a different track gauge - but the conventional lines are as near to self-contained as possible, running their own stock and with very little through running (limited to freight and sleeper services). This means that a delay on one line does not automatically result in a delay on another.
Very frequent passing loops (there is usually one at each station) also minimise delay contagion for trains running in opposite directions on a single-track line. Even the Shinkansen is segmented, with trains from one section (e.g. Tokyo-Osaka) usually not running through to another (e.g. Tokyo-Aomori). Passengers have to change more often, but the connections are reliable enough that this is not a major inconvenience.
A blockage of the Shinkansen lines is also not catastrophic for travel between the major cities, because the pre-existing conventional lines that run parallel to them were left in place, not least to serve smaller intermediate towns. Business travellers between the major cities would still be able to travel on these lines if the Shinkansen were stopped, and presumably special express trains would be laid on to accommodate them - or they could take a plane, if speed were essential.
By contrast, Britain's rail network was savagely cut back about 50 years ago - by a committee headed by officials with tight links to the road building industry. As a result, all those "wasteful" duplicate lines that could have been useful diversionary routes (and were widely used as such during the war) are gone. When a main line is blocked today, for example by the flooding and washouts a couple of weeks ago, it can be extremely difficult to find a way past it without resorting to a road-based replacement service.
... why the HELL would a failing communication network take down a power station, even one, much less all of them?
But if the power stations aren't on a comm net how can we friend them or follow their tweets? These connections are seemingly crucial to success.
Looks like they understood that robustness requires some redundancy (horizontal and vertical). The two should be balanced based on a preset level of robustness for the particular type of network. My guess is that redundancy assumes not only the existence of parallel nodes in case of failure, but also overlapping, similar functions in some nodes that belong to different networks (e.g. Federal and State governments).
It looks similar to balancing security and privacy in data protection.
Clearly this has been on many minds for decades. And many people are currently thinking around it in a variety of ways. On my mind has been the very nature of the economic system which predetermines the shape of most systems. Capitalism (and the individual-group distortion of risk synthesised by the legal structure of corporations) ensures that power laws are biased towards fragile nodes. Consequently both private and public activities are based on systems deliberately designed for short-term value extraction at the cost of fragility and greater likelihood of longer-term failures. These nodes do tend to be separated from each other by principles of competition, but not from those who practise robustness yet will nevertheless fail due to their inevitable dependencies on the fragile nodes where greater value is processed. As an artisan who provides basic needs and as someone who grows my own food, I have disconnected from industrial capitalist processes to a high degree, as an illustration of how I make myself robust by getting away from the feedback loops. But this is no solution, any more than it would be to sell my major share in a company and run before it peaks and fails. What we need is an adjustment to corporation law so that high short-term risk is not built into the system by default.
Networks of metaphors are always particularly attractive and invulnerable to logic attacks.
“why the HELL would a failing communication network take down a power station, even one, much less all of them?
As critical infrastructure, power stations should either have a safe fallback operational mode when isolated from the network, or a dedicated network that isn't susceptible to this sort of threat.”
I don’t claim to be an expert on grid operation but this seems a bit simplistic. “The grid” IS a communication network! By its very nature each element, be it a switch, a transformer, a generator or whatever, is interconnected with the rest of the system. No element can act in isolation or without affecting the rest of the network. It is this dynamic that drives the demand for coordination on some level in order to prevent one element from doing harm to another. The only safe mode for a generation station would be to isolate itself from the rest of the network. This will preserve the station but it will not keep the lights on. This is, more or less, what happens with large-scale cascading blackouts. Each station, one by one, chooses to isolate itself in order to protect itself, leaving the remaining stations unable to “hold up” the grid and causing them to go offline as well, worsening the problem overall. To me, the miracle is that it does not happen more often. I presume the only way to stop this cascade would be to get in front of it by way of very rapid communications that would allow isolation switches to be opened proactively in an attempt to catch the brownout wave and isolate the collapse.
I’m sure there are smart people working on mitigating this dynamic given the economic cost of a blackout. I hope they are also working to build in safety margins. Blackouts generally don’t do substantive damage to the grid itself, but an event that did, such as an EMP (natural or manmade) or a hacked grid control network, would be devastating to society. There is a CIA study rumored to be bouncing around the interweb that predicts rapid societal breakdown and that most Americans would be dead within a year or two in the event of a sustained wide-area grid loss.
The grid is the system of all systems. Without it virtually all other systems will soon fail as well.
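The station-by-station collapse described above can be sketched as a toy load-sharing model. Everything here is an assumption made for illustration: the function `cascade`, the capacity and load figures, and the rule that load is split equally among online stations. Real grids balance load through physics and protection relays, not an even split.

```python
def cascade(capacities, total_load):
    """Return the indices of stations still online once the cascade settles.

    Assumption: total_load is shared equally by all online stations, and any
    station asked to carry more than its capacity isolates itself.
    """
    online = set(range(len(capacities)))
    while online:
        share = total_load / len(online)
        tripped = {i for i in online if capacities[i] < share}
        if not tripped:
            break  # every remaining station can carry its share
        online -= tripped  # survivors must now pick up the tripped load
    return online


# Four hypothetical stations, one weaker than the rest.
capacities = [20, 40, 40, 40]

print(cascade(capacities, 100))  # the weak station trips, the others hold
print(cascade(capacities, 130))  # the weak station trips, then all the rest
```

With a load of 100 the three larger stations absorb the loss of the first, but at 130 the extra share pushes each of them past capacity and the whole system goes dark. That small difference in margin is the kind of safety buffer the comment above hopes designers are building in.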