Disaster Recovery for the Internet

Interesting GAO testimony/report: "Internet Infrastructure: Challenges in Developing a Public/Private Recovery Plan," Gregory C. Wilshusen, Director, Information Security Issues, Government Accountability Office (GAO), October 23, 2007.

Posted on November 5, 2007 at 1:30 PM • 15 Comments

Comments

aikimark • November 5, 2007 2:55 PM

I thought the Internet was an expanded DARPANET, complete with military grade communication infrastructure, immune from disasters.

Patrick Farrell • November 5, 2007 3:35 PM

@aikimark: It would be nice if the internet were built that way. In theory, the internet is immune to local failures, but in reality some routes, the so-called backbones, handle much more traffic than others. I recall a sysadmin telling me that there are just a few key locations that you could attack to make America effectively lose internet connectivity.

Wireless should do something to help make us a little more stable. That and FTTP, imho.

FakeJohn • November 5, 2007 3:41 PM

@Patrick Farrell - where do you think those wireless and FTTP links go when you move upstream a few hops? hey.... chokepoint!

DSA • November 5, 2007 4:14 PM

IIRC the Internet was designed to be able to re-route around failures as part of a Cold War effort to keep NORAD and D.C. connected in the event of a nuclear strike in the U.S. Which might have worked except the EMP from the blast would have burned out every system between NORAD and D.C.

Todd Knarr • November 5, 2007 5:48 PM

Part of the problem's the conflict between efficiency and redundancy.

To build a reliable network that can handle disruption and disaster requires redundancy. You need each node linked to several other nodes by independent paths, so that disasters don't sever every path. You need multiple paths by which traffic can flow, any one of which can handle 100% of the traffic if need be. In short, you need a network with many times the number of links and amount of bandwidth needed.

A commercial network needs to be efficient. To minimize costs you need to reduce the number of links to the bare minimum, and to run those links at as close to 100% capacity as you can. Every redundant link, every bit of unused capacity, is overhead that's costing you money and not bringing in any revenue.
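The trade-off in the two paragraphs above can be put into rough numbers. What follows is only a sketch under simplifying assumptions that are not in the comment: the network is reduced to a set of independent parallel paths between two sites, each with a fixed capacity, and the function names `survives_failures` and `utilization` are made up for illustration.

```python
def survives_failures(path_capacities, demand, failures):
    """Return True if the demand still fits after the worst-case loss of `failures` paths.

    Worst case means the largest paths fail first, so only the smallest ones remain.
    """
    if failures >= len(path_capacities):
        return False  # nothing is left to carry traffic
    remaining = sorted(path_capacities)[: len(path_capacities) - failures]
    return sum(remaining) >= demand


def utilization(path_capacities, demand):
    """Fraction of the total installed capacity that the demand uses when nothing has failed."""
    return demand / sum(path_capacities)


demand = 95.0  # hypothetical traffic between the two sites
designs = {
    "lean": [100.0],                      # one link sized to the demand
    "redundant": [100.0, 100.0, 100.0],   # any single path can carry everything
}

for name, paths in designs.items():
    worst_case = len(paths) - 1 if len(paths) > 1 else 1
    ok = survives_failures(paths, demand, worst_case)
    print(f"{name}: survives {worst_case} failed path(s) = {ok}, "
          f"utilization = {utilization(paths, demand):.0%}")
```

Under these assumptions the lean design runs close to full and cannot lose its only link, while the redundant design survives the loss of all but one path at the price of leaving most of its capacity idle.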

Take a look at the effects of the last port workers' strike on the manufacturing and distribution chain for a case study.

merkelcellcancer • November 5, 2007 8:40 PM

"lack of consensus on DHS’s role and when the department should get involved in responding to a disruption"

I hope DHS never gets involved, or even thinks it is involved.....

Anonymous Coward • November 6, 2007 2:07 AM

Seeing that Bruce has no comments to make, just a link, I thought I'd take the remarkable step of reading the doc before commenting...

Two things come to mind:
a. Internet: No clear definition of what the Internet means, within the scope of this document. Only a brief covering sentence acknowledges that the Internet exists outside the US. And the idea that US DHS can mandate a way to handle recovery of the 'Internet' (whatever that means) without help from outside the US is ridiculous.
b. Disruption: Once again, no definition of what a 'disruption' constitutes. If I can access my bank website but every 2nd link I access results in a time-out, does that constitute a disruption?

As usual with US government institutions, a plan has been built without a clear definition of what the objective is...

And we all know where that takes us.... (Hint: Country in the middle east...)

Nicola • November 6, 2007 4:06 AM

@ DSA:

According to several sources, the story about redundancy as a means of recovery after a nuclear strike is a hoax, or more precisely a possibility raised by a "TIME" article in the sixties...
In reality, one of the first developers of the link protocols wrote to the military administration (at the time DARPA was ARPA and wasn't part of the DoD) explaining the usefulness of a redundant network in case other communications failed (with no mention of a nuclear strike). The internet was developed a few years later by universities (UC first of all) and THEN ALSO deployed by military agencies...

Terry Cloth • November 6, 2007 8:54 AM

"Key challenges [...] include
[...]
(5) leadership and organizational uncertainties within DHS."

Does this surprise anyone here?

Clive Robinson • November 6, 2007 11:42 AM

@Todd Knarr

"A commercial network needs to be efficient."

Unfortunately, as you noted, the more "efficient" things become, the more profitable they are. However, they also become increasingly "brittle".

In practice you are asking for trouble if you run at 80% or more of capacity for over 20% of the time.
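One way to see why high utilization is risky is a textbook queueing model. The comment does not name a model, so the following is only a sketch under the assumption of a single M/M/1 queue, where the mean time a packet spends in the system is the unloaded service time multiplied by 1 / (1 - utilization). The function name `delay_multiplier` is made up for illustration.

```python
def delay_multiplier(utilization):
    """Mean time in an M/M/1 system, expressed as a multiple of the bare service time.

    This is 1 / (1 - rho), where rho is the fraction of capacity in use.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (1.0 - utilization)


for rho in (0.5, 0.6, 0.8, 0.9, 0.95):
    print(f"{rho:.0%} utilization -> {delay_multiplier(rho):.1f}x the unloaded delay")
```

In this model the delay grows slowly up to moderate load and then climbs steeply, roughly doubling between 80% and 90% utilization, so a link that is run hot has little room to absorb a surge or a rerouted flow.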

The same issue applies to members of staff within an organisation (remember Business Process Re-engineering).

In nature, on the other hand, 60% is the normal maximum for efficiency. Above that, the species generally faces extinction whenever there is a slight change in environmental circumstances.

Matt from CT • November 6, 2007 11:55 PM

I dunno, it's late so maybe I'm brilliant or maybe I'm really confused.

Define what you want for disaster recovery.

If it's to restore functionality for basic e-commerce like access to your bank balance, etc; to provide basic web pages for news and such...do we really need a fully redundant architecture?

Specifically, could we establish certain services that would be designated in advance to be degraded -- YouTube, you're outta here for a few days or weeks.

Remember back to September 11, when the news organizations' websites were flooded and they reverted to simple, uncomplicated, just-the-facts HTML?

I'd have to think part of the solution in a crisis that has severely impacted the internet is to discriminate -- perhaps the more appropriate word is triage.

Yeah, Triage the Traffic -- I like that. Video streams, audio streams, torrents get temporarily degraded or outright stopped. Use the remaining connectivity for a high volume of low-bandwidth services, at the expense of a low volume of high-bandwidth services. And ease the restrictions gradually, unlocking certain services (like audio streams of news stations and news-for-the-blind) first, before, say, iTunes and broadcast music radio stations.
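The triage idea could be sketched as a strict-priority allocation of whatever capacity survives. Everything here is hypothetical: the service names, priority numbers, bandwidth figures and the function `triage` are invented for illustration, and a real network would do this with router queueing policy rather than a script.

```python
def triage(capacity, services):
    """Hand out surviving capacity in priority order.

    `services` is a list of (name, priority, demand) tuples; a lower priority
    number means more important. Lower classes get only what is left over.
    """
    allocation = {}
    remaining = capacity
    for name, _priority, demand in sorted(services, key=lambda s: s[1]):
        granted = min(demand, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


# Hypothetical demands, in arbitrary bandwidth units.
services = [
    ("video streaming", 3, 700.0),
    ("news text pages", 1, 50.0),
    ("banking", 1, 30.0),
    ("news audio streams", 2, 120.0),
    ("torrents", 4, 900.0),
]

for name, granted in triage(250.0, services).items():
    print(f"{name}: {granted}")
```

With only 250 units left, the low-bandwidth services are served in full, video is squeezed into the remainder, and torrents get nothing. Easing the restrictions later amounts to rerunning the allocation as capacity comes back.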

Harry • November 7, 2007 9:51 AM

@aikimark: even if DARPANET were military grade communications (the record is unclear on this) by now it is not even close.

Or are you being ironic? Because as late as the mid 1980s, DOD's official backup communications network was AT&T. The Pentagon realized there was no practical way to duplicate the complexity, redundancy and robustness of the commercial phone network. I guess unsecure communications were better than none at all.

@Matt: you have the right idea but...
1) who decides?
2) when is it decided?

One possible answer of "who" is the government, which represents our collective will. (Mostly, more or less, on good days, whatever; I'm hoping to avoid a discussion of the US government.) The other is whoever happens to be able to limit access along some pipe or another. The former takes a very long time, the latter is likely to be arbitrary, capricious, and inconsistent.

Your idea won't work if it needs to be created and implemented after the disruption. So it needs to be done beforehand. The GAO report addresses some of the issues and problems with getting it done before.

Brian • November 7, 2007 4:39 PM

This reads largely like DHS is trying to expand its scope and budget.

Obviously positive points include providing fuel, and security assuming the local utilities and security agencies have collapsed. I don't think anyone would argue that the GAO should and does drive their needs including security with procurement practices.

Many points are already being addressed by businesses as part of continuity planning, or by county and municipal agencies as part of theirs. It's hard to see how DHS could improve this process. The involvement of politics and added complexity would make continuity more difficult.

I'm especially disturbed by DHS suggesting they should be "establishing a system for prioritizing the recovery of Internet service, similar to the existing Telecommunications Service Priority Program". This is done well by service providers today and I find it difficult to believe that DHS could do a better job than the people that design and maintain the networks.

Making internet service providers more like switched phone network providers may not be ideal for everyone involved.

DigitalCommando • November 7, 2007 7:50 PM

The DHS's sole focus in any matter regarding the internet is to ensure that no design, idea or implementation escapes their watchful eye, allowing their continued unfettered access to every part of it. Their "cover" for this is some pathetic attempt at patriarchal oversight and assistance under the guise of doing something good.

Matt from CT • November 8, 2007 8:37 PM

>@Matt: you have the right idea but...
>1) who decides?
>2) when is it decided?

I suppose we're about a decade and maybe a couple years too late for this to be RFC'd, huh?

And from a collision between practical and legal, we may have conflicting priorities.

Obviously the organizations that would control the prioritization would be the backbone providers. They control the fat pipes and choke points of the network where problems would be most likely to manifest themselves.

But what if Sprint and AT&T and the rest also have Service Level Agreements in place? What they *could* do to restore basic internet traffic might very well conflict with SLAs and other contracts in which they promise "equal" or "best effort" or "net neutral" service, or whatever words have been used in contracts with other companies, that prevent them from discriminating against those companies' traffic and indeed penalize them if they do. This is speculative to say the least, but I could see the backbone carriers having legal eagles who tell them to keep everything degraded at the same poor levels, rather than allow most stuff to function but in doing so violate contracts with other important companies.

