On the Cyber Safety Review Board

When an airplane crashes, impartial investigatory bodies leap into action, empowered by law to unearth what happened and why. But there is no such empowered and impartial body to investigate CrowdStrike’s faulty update that recently unfolded, ensnarling banks, airlines, and emergency services to the tune of billions of dollars. We need one. To be sure, there is the White House’s Cyber Safety Review Board. On March 20, the CSRB released a report into last summer’s intrusion by a Chinese hacking group into Microsoft’s cloud environment, where it compromised the U.S. Department of Commerce, State Department, congressional offices, and several associated companies. But the board’s report—well-researched and containing some good and actionable recommendations—shows how it suffers from its lack of subpoena power and its political unwillingness to generalize from specific incidents to the broader industry.

Some background: The CSRB was established in 2021, by executive order, to provide an independent analysis and assessment of significant cyberattacks against the United States. The goal was to pierce the corporate confidentiality that often surrounds such attacks and to provide the entire security community with lessons and recommendations. The more we all know about what happened, the better we can all do next time. It’s the same thinking that led to the formation of the National Transportation Safety Board, but for cyberattacks and not plane crashes.

But the board immediately failed to live up to its mission. It was founded in response to the Russian cyberattack on the U.S. known as SolarWinds. Although it was specifically tasked with investigating that incident, it did not—for reasons that remain unclear.

So far, the board has published three reports. The first two offered only simplistic recommendations. In the first investigation, on Log4J, the CSRB exhorted companies to patch their systems faster and more often. In the second, on Lapsus$, the CSRB told organizations not to use SMS-based two-factor authentication (it’s vulnerable to SIM-swapping attacks). These two recommendations are basic cybersecurity hygiene, and not something we need an investigation to tell us.

The most recent report—on China’s penetration of Microsoft—is much better. This time, the CSRB gave us an extensive analysis of Microsoft’s security failures and placed blame for the attack’s success squarely on their shoulders. Its recommendations were also more specific and extensive, addressing Microsoft’s board and leaders specifically and the industry more generally. The report describes how Microsoft stopped rotating cryptographic keys in early 2021, reducing the security of the systems affected in the hack. The report suggests that if the company had set up an automated or manual key rotation system, or a way to alert teams about the age of their keys, it could have prevented the attack on its systems. The report also looked at how Microsoft’s competitors—think Google, Oracle, and Amazon Web Services—handle this issue, offering insights on how similar companies avoid mistakes.

Yet there are still problems, with the report itself and with the environment in which it was produced.

First, the public report cites a large number of anonymous sources. While the report lays blame for the breach on Microsoft’s lax security culture, it is actually quite deferential to Microsoft; it makes special mention of the company’s cooperation. If the board needed to make trades to get information that would only be provided if people were given anonymity, this should be laid out more explicitly for the sake of transparency. More importantly, the board seems to have conflict-of-interest issues arising from the fact that the investigators are corporate executives and heads of government agencies who have full-time jobs.

Second: Unlike the NTSB, the CSRB lacks subpoena power. This is, at least in part, out of fear that the conflicted tech executives and government employees would use the power in an anticompetitive fashion. As a result, the board must rely on wheedling and cooperation for its fact-finding. While the DHS press release said, “Microsoft fully cooperated with the Board’s review,” the next company may not be nearly as cooperative, and we do not know what was not shared with the CSRB.

One of us, Tarah, recently testified on this topic before the U.S. Senate’s Homeland Security and Governmental Affairs Committee, and the senators asking questions seemed genuinely interested in how to fix the CSRB’s extreme slowness and lack of transparency in the two reports they’d issued so far.

It’s a hard task. The CSRB’s charter comes from Executive Order 14028, which is why, unlike the NTSB, it doesn’t have subpoena power. Congress needs to codify the CSRB in law and give it the subpoena power it so desperately needs.

Additionally, the CSRB’s reports don’t provide useful guidance going forward. For example, the Microsoft report provides no mapping of the company’s security problems to any government standards that could have prevented them. In this case, the problem is that there are no standards for key rotation overseen by NIST, the organization in charge of cybersecurity standards. It would have been better for the report to have said that explicitly. The cybersecurity industry needs NIST standards to give us a compliance floor below which any organization is explicitly failing to provide due care. The report condemns Microsoft for not rotating an internal encryption key for seven years, when its standard internally was four years. However, for the last several years, automated key rotation more on the order of once a month or even more frequently has become the expected industry guideline.

A guideline, however, is not a standard or regulation. It’s just a strongly worded suggestion. In this specific case, the report doesn’t offer guidance on how often keys should be rotated. In essence, the CSRB report said that Microsoft should feel very bad about the fact that they did not rotate their keys more often—but did not explain the logic, give an actual baseline of how often keys should be rotated, or provide any statistical or survey data to support why that timeline is appropriate. Automated certificate rotation such as that provided by public free service Let’s Encrypt has revolutionized encrypted-by-default communications, and expectations in the cybersecurity industry have risen to match. Unfortunately, the report only discusses Microsoft proprietary keys by brand name, instead of having a larger discussion of why public key infrastructure exists or what the best practices should be.

More generally, because the CSRB reports so far have failed to generalize their findings with transparent and thorough research that provides real standards and expectations for the cybersecurity industry, we—policymakers, industry leaders, the U.S. public—find ourselves filling in the gaps. Individual experts are having to provide anecdotal and individualized interpretations of what their investigations might imply for companies simply trying to learn what their actual due care responsibilities are.

It’s as if no one is sure whether boiling your drinking water or nailing a horseshoe up over the door is statistically more likely to decrease the incidence of cholera. Sure, a lot of us think that boiling your water is probably best, but no one is saying that with real science. No one is saying how long you have to boil your water for, or if any water sources are more likely to carry illness. And until there are real numbers and general standards, our educated opinions are on an equal footing with horseshoes and hope.

It should not be the job of cybersecurity experts, even us, to generate lessons from CSRB reports based on our own opinions. This is why we continue to ask the CSRB to provide generalizable standards which either are based on or call for NIST standardization. We want prescriptive and descriptive reports of incidents: see, for example, the UK National Audit Office report on the WannaCry ransomware, which remains a gold standard of government cybersecurity incident investigation reports.

We need and deserve more than one-off anecdotes about how one company didn’t do security well and should do it better in future. Let’s start treating cybersecurity like the equivalent of public safety and get some real lessons learned.

This essay was written with Tarah Wheeler, and was published on Defense One.

Posted on August 6, 2024 at 7:01 AM • 16 Comments

Comments

Duncan Hart August 6, 2024 11:32 AM

Bruce,
With increasing complexity of software intensive systems do you think we’ll be unable to distinguish between system accidents and criminal incidents?

Clive Robinson August 6, 2024 11:39 AM

@ Bruce,

I’m not going to go into details but the CSRB is the result of a political turf-war and got emasculated as a result.

For it to get the powers of the NTSB then there needs to be the same sort of situation that gave rise to the need for the NTSB…

But also there needs to be in place legislation to stop it becoming “A boys club” with a “revolving door” or “regulatory capture”,

“Likelihood of regulatory capture is a risk to which an agency is exposed by its very nature. This suggests that a regulator should be protected from outside influence as much as possible. Alternatively, it may be better to not create a given agency at all. A captured regulator is often worse than no regulation, because it wields the authority of government.”

https://en.m.wikipedia.org/wiki/Regulatory_capture

The simple fact is that most Federal Agencies are corrupt in some way but people turn their eyes and look away.

Solving this issue requires some quite major changes in the political system, that are not going to happen over-night or at all if some have their way.

Oh and if people think I’m being “party political” I’m not. As so delicately noted by Ralph Nader,

“The only difference between the Republican and Democratic parties is the velocities with which their knees hit the floor when corporations knock on their door. That’s the only difference.”

The problem “Is the System” and without taking a hatchet to it and turning it into kindling for a weenie roast we will be having this conversation over and over.

cyrus August 6, 2024 7:34 PM

Yet there are still problems … First, the public report cites a large number of anonymous sources. While the report lays blame for the breach on Microsoft’s lax security culture, it is actually quite deferential to Microsoft; it makes special mention of the company’s cooperation.

Given the NTSB comparison, this reads a bit strangely to me: a common aspect of transportation safety investigations is that they explicitly don’t seek to assign blame. Quoting AAR-20-01:

The NTSB does not assign fault or blame for an accident or incident; rather, as specified by NTSB regulation, “accident/incident investigations are fact-finding proceedings with no formal issues and no adverse parties … and are not conducted for the purpose of determining the rights or liabilities of any person” (Title 49 Code of Federal Regulations section 831.4). Assignment of fault or legal liability is not relevant to the NTSB’s statutory mission to improve transportation safety by investigating accidents and incidents and issuing safety recommendations. In addition, statutory language prohibits the admission into evidence or use of any part of an NTSB report related to an accident in a civil action for damages resulting from a matter mentioned in the report (Title 49 United States Code section 1154(b)).

It’s not unique to the USA. Quoting a UK report: “The sole objective of the investigation of an accident or incident under these Regulations is the prevention of future accidents and incidents. It is not the purpose of such an investigation to apportion blame or liability.” Sweden: “SHK investigates accidents and incidents from a safety perspective. Its investigations are aimed at preventing a similar event from occurring in the future, or limiting the effects of such an event. The investigations do not deal with issues of guilt, blame or liability for damages.” Indonesia: “Readers should note that the information in NTSC reports and recommendations is provided to promote aviation safety. In no case is it intended to imply blame or liability.”

So, what’s wrong with anonymous sources? How would knowing a source’s name help improve “cyber safety”? And how’s it a problem to mention whether an affected company helped?

ResearcherZero August 6, 2024 11:42 PM

Loper Bright Enterprises v. Raimondo was a case challenging the validity of the Chevron doctrine, which allows courts to defer to agency interpretations of ambiguous statutes.

“The true impact of this ruling will likely be defined through years of litigation, as courts, agencies, and Congress grapple with its practical implications.”

‘https://www.jdsupra.com/legalnews/the-end-of-chevron-deference-what-the-3342874/

What does it mean?

Be prepared to update compliance programs if regulatory or legal requirements change as a result of jurisprudence, and establish a good working relationship with legal experts.

“Congress took relatively little action to establish security requirements in most industry sectors over the past decade, seemingly leaving it to agencies to take action. ”

As a result, federal agencies have often turned to older statutory mandates to update security regulations. Particularly with regard to critical infrastructure security, this has sometimes prompted agencies to take a “creative approach” to addressing modern threats on a sector-by-sector basis.

https://www.venable.com/insights/publications/2024/chevron-decision/cybersecurity-policymaking-post-chevron

It’s likely up to Congress… and what ever happens in the courts. (laypersons)

https://www.wired.com/story/us-supreme-court-chevron-deference-cybersecurity-policy/

ResearcherZero August 7, 2024 2:11 AM

Whereas decoupling means a complete break, de-risking is the action of reducing risk.

‘https://www.hinrichfoundation.com/research/article/trade-and-geopolitics/china-decoupling-vs-de-risking/

But what does de-risking actually mean in a practical sense?

xTwitter begins lawsuit against advertisers after they stop advertising…

‘https://www.nytimes.com/2024/08/06/technology/x-antitrust-suit-advertisers-elon-musk.html

“That’s how I feel, don’t advertise.”

Musk told advertisers they should stop spending on xTwitter.
https://www.wired.com/story/elon-musk-x-advertisers-interview/

More than half of Twitter’s top 1,000 advertisers stopped spending on the platform.

(pro-Nazi content had also been appearing alongside advertising from reputable brands)

https://edition.cnn.com/2023/02/10/tech/twitter-top-advertiser-decline/index.html

The consequence of firing most of your trust and safety team!?

(Twitter used to have around 1,500 content moderators.)

“Engineers focused on trust and safety issues at X had been reduced from 279 globally to 55, a fall of 80%. Full-time employee content moderators had been reduced 52% from 107 to 51.”

Musk had already laid off half the company’s 7,500 full-time employees in November 2022.

https://eu.detroitnews.com/story/tech/2024/01/10/x-corp-twitter-trust-safety-staff-australia-online-safety-watchdog/72173679007/

JonKnowsNothing August 7, 2024 8:38 PM

@Clive

re: Bulging in a Teflon seal and Richard Feynman

An upcoming very important Real Life execution of problematic code and problematic mechanical systems is the attempt to safely retrieve astronauts Butch Wilmore and Suni Williams from the International Space Station.

… in his appendix to the commission’s report (which was included only after he threatened not to sign the report), “For a successful technology, reality must take precedence over public relations, for nature cannot be fooled.”

Richard Feynman Rogers Commission Report – Challenger disaster

A MSM Report gives a good overview of the current public issues

  • a bulging in a Teflon seal in an oxidizer valve known as a “poppet … engineers still don’t understand precisely why the bulging is occurring and whether it will manifest on Starliner’s flight back to Earth
  • Undocking software … the current version of the Undocking Software requires the crew on the other side of the docking interface to perform specific actions. Since the plan is that no one will be left on board the ISS, this software needs to be swapped out for Autonomous Undocking software where the system does every step in the process itself without crew on the other side. This code has not been reviewed or used since ~2022.
    • It was archived 2 years ago and now NASA needs to “resurrect the software parameters that are required to give automatic responses”.

Per documentaries, professional astronauts are given extensive training in Oh SHYTE conditions. Auxiliary astronauts may not have the same level of OH SHYTE training acceptance.

The article explains some of the internal NASA disagreements.

There is a binary outcome:

  • It will work
  • It will not work

====

ht tps:/ /ars technica.com/space/2024/08/nasa-official-acknowledges-internal-disagreement-on-safety-of-starliner-return/

Clive Robinson August 8, 2024 12:11 AM

@ Duncan Hart, ALL,

Re : Accident or Design?

“… do you think we’ll be unable to distinguish between system accidents and criminal incidents?”

It’s a question that comes up from time to time and some discussion comes by way of Dr. Edmond Locard’s “Exchange Principle” stated as,

“Every contact leaves a trace”.

He was a French pioneer in forensic science, who became known as the “Sherlock Holmes of Lyon”.

His formulation of what is considered to be the basic tenet or principle of forensic science is still taught today when dealing with “physical” forensic investigations. Even though it can be shown by those who work in the science of Metrology (measurement) to have fundamental limits.

But what of “informational” forensic investigations: does the contact get recorded in some way?

Well the answer to that in reality is,

“It depends upon the whole system.”

And what gets recorded or not by way of,

1, Data
2, Meta-Data
3, Meta-Meta-data

Whilst all three exist around us in our and others information systems, three obvious questions arise,

1, Can it be accessed?
2, Can it be recognised?
3, Can it be evidence?

Currently the answer is,

“If it is not recorded at some point for some reason it is unavailable.”

Which is why in the past attackers destroyed logs by exhausting them in some way (such as sending thousands of “page feeds” or hundreds of thousands of “line feeds” to push all the paper through the “line printer/terminal” used as the command console).

Even though that may eliminate the “Data” from being recorded there may well be “Meta-Data” in other logs or by easily visible changes (such as printers being empty, hard drives being full, or processes being crashed etc).

Then there is “Meta-Meta-Data”, one form of which is to be able to show that something that should be there is not there, though the meta-data or data that caused it is no longer available.

For instance from other types of physical attack the simple fact that someone was not where they should have been when they should have been may be sufficient to throw sufficient suspicion.

For instance the dread mobile phone, slave master and snitch on much of the world’s Western Population. It registers not just with the cellular network but with the likes of Apple, Google and Microsoft and App developers. Because there is the illusion that “Meta-data is currency”.

Thus your movements etc get logged 24×365.25 and from statistical analysis your “normal behaviour” can be seen. The fact your phone is off when it should be on is a “tell-tale”. Along with other historic information indicating what state the phone battery was in can turn it into “circumstantial evidence”.

We’ve already seen evidence built from data in a “fit-bit” that a person was moving and that was used to build a case to present to a jury,

https://www.theverge.com/2022/5/12/23068898/fitbit-murder-trial-testimony-scientific-evidence

But note how the scientist talks about the “measurement” and questions not asked thus not answered.

44 52 4D CO+2 August 8, 2024 1:12 AM

Holy Passive-Language Batman!

From the first link: ‘faulty update’

microsoft-outage-cause-azure-crowdstrike

Clive Robinson August 8, 2024 11:01 PM

@ JonKnowsNothing,

Re : Bulges are caused by excess.

“a bulging in a Teflon seal in an oxidizer valve known as a “poppet … engineers still don’t understand precisely why the bulging is occurring and whether it will manifest on Starliner’s flight back to Earth”

Hmm… Bulges happen when you have an excess pushing against a non linearity.

I suspect the journalist does not know or was not told about the various limits and how they change.

In theory below the “plastic limit” the “bulge” will return to the starting state once the pressure is removed.

However what can happen with what appears to be the plastic limit is it deteriorates due to repetitive deformation. Somewhat similar to “work hardening” in metals.

However there is the non fun issue of induced nonlinearity. You get this with washers and the like that are of non uniform thickness or surface finish. The result is that a seal can under pressure move in a non uniform way and much like squeezing a partially inflated balloon in your fist pop out in all sorts of directions that quickly become bulges and… Tend to suffer “run away” effects.

It’s why seals are often made as quite thick O-Rings that sit in a machined groove that is only 9/10ths of the thickness of the O-Ring.

However as the Space Shuttle showed, with the tang/clevis joint of the SRB if the plate pressure (tang) does not remain uniform (on the clevis) then the O-Ring will fail in a semi-predictable way (in the case of the space shuttle the tang rotated making failure all the more easy).

You can see this described quite well in Diane Vaughan’s book “The Challenger Launch Decision” (of which I have a UK First Edition). It’s usually in my dead tree cave, but it’s out at the moment with my son who is reading it as part of his Aerospace training over the summer.

Otherwise I’d give you the diagram and page number 😉

As for the bulk of the “official report”, there is an expression of,

“Fit only for perforation.”

That implies that the pages should be perfed into toilet paper sized pieces such that it is at least some use as “BumFodder”…

ResearcherZero August 9, 2024 3:21 AM

“People who feel the pain of our failures are not included in the conversation.”

‘https://www.wired.com/story/undisruptable27-us-critical-infrastructure-cybersecurity/

ResearcherZero August 9, 2024 6:25 AM

Feeling a little paranoid?

‘https://www.psychologytoday.com/us/blog/escaping-our-mental-traps/202307/paranoia-the-irresistible-urge-to-suspect

Rossmann will no longer buy Tesla’s electric vehicles for its fleet, effective immediately.

‘https://electrek.co/2024/08/06/tesla-loses-corporate-sales-over-elon-musk-tesla-mission/

Unsold Teslas are reportedly piling up in storage lots across the country.
https://prospect.org/power/2024-06-17-elon-musk-decline-of-tesla/

Don’t trust localhost, not even locally.

‘https://www.oligo.security/blog/0-0-0-0-day-exploiting-localhost-apis-from-the-browser

Exploiting update flow to downgrade updates and bypass VBS UEFI Lock.

‘https://www.safebreach.com/blog/downgrade-attacks-using-windows-updates

Rontea August 9, 2024 11:31 AM

“chatbots that advise users to commit suicide”

I didn’t know about this tragic outcome and I am not sure if the company that created the chatbot should be held liable in a lawsuit or how section 230 applies. Should the author (AI or human) be liable for the content they produce, or will they have a disclaimer in legalese? It seems like this needs to be legislated.

JonKnowsNothing August 11, 2024 12:50 AM

@Clive

re: Even the simple things fall

A couple of recent MSM reports on specific failures in design causing serious injuries and deaths. These designs look great on paper and on display but hide a fatal “who thought this was a good idea?” flaw.

  • Samsung to recall 1.1M stoves with Front Facing Knobs that almost turn themselves on. There is a replacement set of knobs that require an extra squeeze to turn them.
    • Extra squeeze may stop the current flaw where the knob easily turns on the appliance, but likely doesn’t do much for particular groups of people with disabilities or infirmities. One bad idea leads to more bad ideas.
  • Redesign of Car On The Steering Wheel cruise controls that are “touch based” capacitive controls and not button-click based. Turning the wheel the driver can “brush” against the touch based controls activating the cruise control system and initiating cruise states.
    • The car’s black box did not register an accident and the airbags did not deploy, presumably due to the lower speeds involved.
    • Looks great in the flash ads, but having your car jump into cruise control at “last setting” while making a turn in urban traffic probably wasn’t what folks had in mind when admiring the James Bond look at the dealership.

We cannot even create a knob to turn on an oven safely. We design new failure points at every iteration. Trying to “keep up” is not going to be easy, and STOP isn’t in the vocabulary either.

Winter August 13, 2024 5:02 AM

@Clive

Why necessitated, well in part it’s “Smart Grid” nonsense, but is what lies behind them. It’s because in the UK the Government are “pulling out gas” from peoples homes because it’s a risk (supposedly as a kick over from Grenfell Tower disaster).

I am not in the UK, so I cannot comment on the internal politics there.

In the Netherlands, we have a shortage of power lines due to a rising need for electricity. This is indeed part of a shift away from petrol and natural gas. The move to wind and solar energy is an important part of this policy.

Given our own gas reserves have been closed (earthquakes were destroying cities in the production area) and our main supplier has sworn to destroy us, getting off gas is really important.

To prevent the peaks of supply and demand to burn out our grid, “Smart Grids” are a necessity. Smart Grids allow better pricing of power to lower the peaks.

As for “Carbon emissions”, the Netherlands has become a wine country, the Rhine and Meuse are running dry in summer and flood us in winter, Spain is running out of water, the twelve months ending in July 2024 were globally the hottest by month ever recorded. So, yes, reducing greenhouse gasses is an important policy goal to keep Europe habitable.

[1] ‘https://www.rijkswaterstaat.nl/en/news/archive/2023/12/future-discharges-in-the-rhine-and-meuse-lower-in-summer-and-higher-in-winter

ResearcherZero August 21, 2024 12:59 AM

90 claims of successful supply chain attacks in the last 6 months

‘https://cyble.com/blog/surge-in-software-supply-chain-attacks-heightens-third-party-vigilance/

