Entries Tagged "risk assessment"

Page 1 of 22

The CrowdStrike Outage and Market-Driven Brittleness

Friday’s massive internet outage, caused by a mid-sized tech company called CrowdStrike, disrupted major airlines, hospitals, and banks. Nearly 7,000 flights were canceled. It took down 911 systems and factories, courthouses, and television stations. Tallying the total cost will take time. The outage affected more than 8.5 million Windows computers, and the cost will surely be in the billions of dollars­—easily matching the most costly previous cyberattacks, such as NotPetya.

The catastrophe is yet another reminder of how brittle global internet infrastructure is. It’s complex, deeply interconnected, and filled with single points of failure. As we experienced last week, a single problem in a small piece of software can take large swaths of the internet and global economy offline.

The brittleness of modern society isn’t confined to tech. We can see it in many parts of our infrastructure, from food to electricity, from finance to transportation. This is often a result of globalization and consolidation, but not always. In information technology, brittleness also results from the fact that hundreds of companies, none of which you’ve heard of, each perform a small but essential role in keeping the internet running. CrowdStrike is one of those companies.

This brittleness is a result of market incentives. In enterprise computing—as opposed to personal computing—a company that provides computing infrastructure to enterprise networks is incentivized to be as integral as possible, to have as deep access into their customers’ networks as possible, and to run as leanly as possible.

Redundancies are unprofitable. Being slow and careful is unprofitable. Being less embedded in and less essential and having less access to the customers’ networks and machines is unprofitable—at least in the short term, by which these companies are measured. This is true for companies like CrowdStrike. It’s also true for CrowdStrike’s customers, who also didn’t have resilience, redundancy, or backup systems in place for failures such as this because they are also an expense that affects short-term profitability.

But brittleness is profitable only when everything is working. When a brittle system fails, it fails badly. The cost of failure to a company like CrowdStrike is a fraction of the cost to the global economy. And there will be a next CrowdStrike, and one after that. The market rewards short-term profit-maximizing systems, and doesn’t sufficiently penalize such companies for the impact their mistakes can have. (Stock prices depress only temporarily. Regulatory penalties are minor. Class-action lawsuits settle. Insurance blunts financial losses.) It’s not even clear that the information technology industry could exist in its current form if it had to take into account all the risks such brittleness causes.

The asymmetry of costs is largely due to our complex interdependency on so many systems and technologies, any one of which can cause major failures. Each piece of software depends on dozens of others, typically written by other engineering teams sometimes years earlier on the other side of the planet. Some software systems have not been properly designed to contain the damage caused by a bug or a hack of some key software dependency.

These failures can take many forms. The CrowdStrike failure was the result of a buggy software update. The bug didn’t get caught in testing and was rolled out to CrowdStrike’s customers worldwide. Sometimes, failures are deliberate results of a cyberattack. Other failures are just random, the result of some unforeseen dependency between different pieces of critical software systems.

Imagine a house where the drywall, flooring, fireplace, and light fixtures are all made by companies that need continuous access and whose failures would cause the house to collapse. You’d never set foot in such a structure, yet that’s how software systems are built. It’s not that 100 percent of the system relies on each company all the time, but 100 percent of the system can fail if any one of them fails. But doing better is expensive and doesn’t immediately contribute to a company’s bottom line.

Economist Ronald Coase famously described the nature of the firm­—any business­—as a collection of contracts. Each contract has a cost. Performing the same function in-house also has a cost. When the costs of maintaining the contract are lower than the cost of doing the thing in-house, then it makes sense to outsource: to another firm down the street or, in an era of cheap communication and coordination, to another firm on the other side of the planet. The problem is that both the financial and risk costs of outsourcing can be hidden—delayed in time and masked by complexity—and can lead to a false sense of security when companies are actually entangled by these invisible dependencies. The ability to outsource software services became easy a little over a decade ago, due to ubiquitous global network connectivity, cloud and software-as-a-service business models, and an increase in industry- and government-led certifications and box-checking exercises.

This market force has led to the current global interdependence of systems, far and wide beyond their industry and original scope. It’s why flying planes depends on software that has nothing to do with the avionics. It’s why, in our connected internet-of-things world, we can imagine a similar bad software update resulting in our cars not starting one morning or our refrigerators failing.

This is not something we can dismantle overnight. We have built a society based on complex technology that we’re utterly dependent on, with no reliable way to manage that technology. Compare the internet with ecological systems. Both are complex, but ecological systems have deep complexity rather than just surface complexity. In ecological systems, there are fewer single points of failure: If any one thing fails in a healthy natural ecosystem, there are other things that will take over. That gives them a resilience that our tech systems lack.

We need deep complexity in our technological systems, and that will require changes in the market. Right now, the market incentives in tech are to focus on how things succeed: A company like CrowdStrike provides a key service that checks off required functionality on a compliance checklist, which makes it all about the features that they will deliver when everything is working. That’s exactly backward. We want our technological infrastructure to mimic nature in the way things fail. That will give us deep complexity rather than just surface complexity, and resilience rather than brittleness.

How do we accomplish this? There are examples in the technology world, but they are piecemeal. Netflix is famous for its Chaos Monkey tool, which intentionally causes failures to force the systems (and, really, the engineers) to be more resilient. The incentives don’t line up in the short term: It makes it harder for Netflix engineers to do their jobs and more expensive for them to run their systems. Over years, this kind of testing generates more stable systems. But it requires corporate leadership with foresight and a willingness to spend in the short term for possible long-term benefits.

Last week’s update wouldn’t have been a major failure if CrowdStrike had rolled out this change incrementally: first 1 percent of their users, then 10 percent, then everyone. But that’s much more expensive, because it requires a commitment of engineer time for monitoring, debugging, and iterating. And can take months to do correctly for complex and mission-critical software. An executive today will look at the market incentives and correctly conclude that it’s better for them to take the chance than to “waste” the time and money.

The usual tools of regulation and certification may be inadequate, because failure of complex systems is inherently also complex. We can’t describe the unknown unknowns involved in advance. Rather, what we need to codify are the processes by which failure testing must take place.

We know, for example, how to test whether cars fail well. The National Highway Traffic Safety Administration crashes cars to learn what happens to the people inside. But cars are relatively simple, and keeping people safe is straightforward. Software is different. It is diverse, is constantly changing, and has to continually adapt to novel circumstances. We can’t expect that a regulation that mandates a specific list of software crash tests would suffice. Again, security and resilience are achieved through the process by which we fail and fix, not through any specific checklist. Regulation has to codify that process.

Today’s internet systems are too complex to hope that if we are smart and build each piece correctly the sum total will work right. We have to deliberately break things and keep breaking them. This repeated process of breaking and fixing will make these systems reliable. And then a willingness to embrace inefficiencies will make these systems resilient. But the economic incentives point companies in the other direction, to build their systems as brittle as they can possibly get away with.

This essay was written with Barath Raghavan, and previously appeared on Lawfare.com.

Posted on July 25, 2024 at 2:37 PMView Comments

New SEC Rules around Cybersecurity Incident Disclosures

The US Securities and Exchange Commission adopted final rules around the disclosure of cybersecurity incidents. There are two basic rules:

  1. Public companies must “disclose any cybersecurity incident they determine to be material” within four days, with potential delays if there is a national security risk.
  2. Public companies must “describe their processes, if any, for assessing, identifying, and managing material risks from cybersecurity threats” in their annual filings.

The rules go into effect this December.

In an email newsletter, Melissa Hathaway wrote:

Now that the rule is final, companies have approximately six months to one year to document and operationalize the policies and procedures for the identification and management of cybersecurity (information security/privacy) risks. Continuous assessment of the risk reduction activities should be elevated within an enterprise risk management framework and process. Good governance mechanisms delineate the accountability and responsibility for ensuring successful execution, while actionable, repeatable, meaningful, and time-dependent metrics or key performance indicators (KPI) should be used to reinforce realistic objectives and timelines. Management should assess the competency of the personnel responsible for implementing these policies and be ready to identify these people (by name) in their annual filing.

News article.

Posted on August 2, 2023 at 7:04 AMView Comments

On the Catastrophic Risk of AI

Earlier this week, I signed on to a short group statement, coordinated by the Center for AI Safety:

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

The press coverage has been extensive, and surprising to me. The New York Times headline is “A.I. Poses ‘Risk of Extinction,’ Industry Leaders Warn.” BBC: “Artificial intelligence could lead to extinction, experts warn.” Other headlines are similar.

I actually don’t think that AI poses a risk to human extinction. I think it poses a similar risk to pandemics and nuclear war—which is to say, a risk worth taking seriously, but not something to panic over. Which is what I thought the statement said.

In my talk at the RSA Conference last month, I talked about the power level of our species becoming too great for our systems of governance. Talking about those systems, I said:

Now, add into this mix the risks that arise from new and dangerous technologies such as the internet or AI or synthetic biology. Or molecular nanotechnology, or nuclear weapons. Here, misaligned incentives and hacking can have catastrophic consequences for society.

That was what I was thinking about when I agreed to sign on to the statement: “Pandemics, nuclear weapons, AI—yeah, I would put those three in the same bucket. Surely we can spend the same effort on AI risk as we do on future pandemics. That’s a really low bar.” Clearly I should have focused on the word “extinction,” and not the relative comparisons.

Seth Lazar, Jeremy Howard, and Arvind Narayanan wrote:

We think that, in fact, most signatories to the statement believe that runaway AI is a way off yet, and that it will take a significant scientific advance to get there­—ne that we cannot anticipate, even if we are confident that it will someday occur. If this is so, then at least two things follow.

I agree with that, and with their follow up:

First, we should give more weight to serious risks from AI that are more urgent. Even if existing AI systems and their plausible extensions won’t wipe us out, they are already causing much more concentrated harm, they are sure to exacerbate inequality and, in the hands of power-hungry governments and unscrupulous corporations, will undermine individual and collective freedom.

This is what I wrote in Click Here to Kill Everybody (2018):

I am less worried about AI; I regard fear of AI more as a mirror of our own society than as a harbinger of the future. AI and intelligent robotics are the culmination of several precursor technologies, like machine learning algorithms, automation, and autonomy. The security risks from those precursor technologies are already with us, and they’re increasing as the technologies become more powerful and more prevalent. So, while I am worried about intelligent and even driverless cars, most of the risks arealready prevalent in Internet-connected drivered cars. And while I am worried about robot soldiers, most of the risks are already prevalent in autonomous weapons systems.

Also, as roboticist Rodney Brooks pointed out, “Long before we see such machines arising there will be the somewhat less intelligent and belligerent machines. Before that there will be the really grumpy machines. Before that the quite annoying machines. And before them the arrogant unpleasant machines.” I think we’ll see any new security risks coming long before they get here.

I do think we should worry about catastrophic AI and robotics risk. It’s the fact that they affect the world in a direct, physical manner—and that they’re vulnerable to class breaks.

(Other things to read: David Chapman is good on scary AI. And Kieran Healy is good on the statement.)

Okay, enough. I should also learn not to sign on to group statements.

EDITED TO ADD (9/9): The Brooks quote is from this excellent essay.

Posted on June 1, 2023 at 7:17 AMView Comments

VPNs and Trust

TorrentFreak surveyed nineteen VPN providers, asking them questions about their privacy practices: what data they keep, how they respond to court order, what country they are incorporated in, and so on.

Most interesting to me is the home countries of these companies. Express VPN is incorporated in the British Virgin Islands. NordVPN is incorporated in Panama. There are VPNs from the Seychelles, Malaysia, and Bulgaria. There are VPNs from more Western and democratic countries like the US, Switzerland, Canada, and Sweden. Presumably all of those companies follow the laws of their home country.

And it matters. I’ve been thinking about this since Trojan Shield was made public. This is the joint US/Australia-run encrypted messaging service that lured criminals to use it, and then spied on everything they did. Or, at least, Australian law enforcement spied on everyone. The FBI wasn’t able to because the US has better privacy laws.

We don’t talk about it a lot, but VPNs are entirely based on trust. As a consumer, you have no idea which company will best protect your privacy. You don’t know the data protection laws of the Seychelles or Panama. You don’t know which countries can put extra-legal pressure on companies operating within their jurisdiction. You don’t know who actually owns and runs the VPNs. You don’t even know which foreign companies the NSA has targeted for mass surveillance. All you can do is make your best guess, and hope you guessed well.

Posted on June 16, 2021 at 6:17 AMView Comments

Presidential Cybersecurity and Pelotons

President Biden wants his Peloton in the White House. For those who have missed the hype, it’s an Internet-connected stationary bicycle. It has a screen, a camera, and a microphone. You can take live classes online, work out with your friends, or join the exercise social network. And all of that is a security risk, especially if you are the president of the United States.

Any computer brings with it the risk of hacking. This is true of our computers and phones, and it’s also true about all of the Internet-of-Things devices that are increasingly part of our lives. These large and small appliances, cars, medical devices, toys and—yes—exercise machines are all computers at their core, and they’re all just as vulnerable. Presidents face special risks when it comes to the IoT, but Biden has the NSA to help him handle them.

Not everyone is so lucky, and the rest of us need something more structural.

US presidents have long tussled with their security advisers over tech. The NSA often customizes devices, but that means eliminating features. In 2010, President Barack Obama complained that his presidential BlackBerry device was “no fun” because only ten people were allowed to contact him on it. In 2013, security prevented him from getting an iPhone. When he finally got an upgrade to his BlackBerry in 2016, he complained that his new “secure” phone couldn’t take pictures, send texts, or play music. His “hardened” iPad to read daily intelligence briefings was presumably similarly handicapped. We don’t know what the NSA did to these devices, but they certainly modified the software and physically removed the cameras and microphones—and possibly the wireless Internet connection.

President Donald Trump resisted efforts to secure his phones. We don’t know the details, only that they were regularly replaced, with the government effectively treating them as burner phones.

The risks are serious. We know that the Russians and the Chinese were eavesdropping on Trump’s phones. Hackers can remotely turn on microphones and cameras, listening in on conversations. They can grab copies of any documents on the device. They can also use those devices to further infiltrate government networks, maybe even jumping onto classified networks that the devices connect to. If the devices have physical capabilities, those can be hacked as well. In 2007, the wireless features of Vice President Richard B. Cheney’s pacemaker were disabled out of fears that it could be hacked to assassinate him. In 1999, the NSA banned Furbies from its offices, mistakenly believing that they could listen and learn.

Physically removing features and components works, but the results are increasingly unacceptable. The NSA could take Biden’s Peloton and rip out the camera, microphone, and Internet connection, and that would make it secure—but then it would just be a normal (albeit expensive) stationary bike. Maybe Biden wouldn’t accept that, and he’d demand that the NSA do even more work to customize and secure the Peloton part of the bicycle. Maybe Biden’s security agents could isolate his Peloton in a specially shielded room where it couldn’t infect other computers, and warn him not to discuss national security in its presence.

This might work, but it certainly doesn’t scale. As president, Biden can direct substantial resources to solving his cybersecurity problems. The real issue is what everyone else should do. The president of the United States is a singular espionage target, but so are members of his staff and other administration officials.

Members of Congress are targets, as are governors and mayors, police officers and judges, CEOs and directors of human rights organizations, nuclear power plant operators, and election officials. All of these people have smartphones, tablets, and laptops. Many have Internet-connected cars and appliances, vacuums, bikes, and doorbells. Every one of those devices is a potential security risk, and all of those people are potential national security targets. But none of those people will get their Internet-connected devices customized by the NSA.

That is the real cybersecurity issue. Internet connectivity brings with it features we like. In our cars, it means real-time navigation, entertainment options, automatic diagnostics, and more. In a Peloton, it means everything that makes it more than a stationary bike. In a pacemaker, it means continuous monitoring by your doctor—and possibly your life saved as a result. In an iPhone or iPad, it means…well, everything. We can search for older, non-networked versions of some of these devices, or the NSA can disable connectivity for the privileged few of us. But the result is the same: in Obama’s words, “no fun.”

And unconnected options are increasingly hard to find. In 2016, I tried to find a new car that didn’t come with Internet connectivity, but I had to give up: there were no options to omit that in the class of car I wanted. Similarly, it’s getting harder to find major appliances without a wireless connection. As the price of connectivity continues to drop, more and more things will only be available Internet-enabled.

Internet security is national security—not because the president is personally vulnerable but because we are all part of a single network. Depending on who we are and what we do, we will make different trade-offs between security and fun. But we all deserve better options.

Regulations that force manufacturers to provide better security for all of us are the only way to do that. We need minimum security standards for computers of all kinds. We need transparency laws that give all of us, from the president on down, sufficient information to make our own security trade-offs. And we need liability laws that hold companies liable when they misrepresent the security of their products and services.

I’m not worried about Biden. He and his staff will figure out how to balance his exercise needs with the national security needs of the country. Sometimes the solutions are weirdly customized, such as the anti-eavesdropping tent that Obama used while traveling. I am much more worried about the political activists, journalists, human rights workers, and oppressed minorities around the world who don’t have the money or expertise to secure their technology, or the information that would give them the ability to make informed decisions on which technologies to choose.

This essay previously appeared in the Washington Post.

Posted on February 5, 2021 at 5:58 AMView Comments

The Legal Risks of Security Research

Sunoo Park and Kendra Albert have published “A Researcher’s Guide to Some Legal Risks of Security Research.”

From a summary:

Such risk extends beyond anti-hacking laws, implicating copyright law and anti-circumvention provisions (DMCA §1201), electronic privacy law (ECPA), and cryptography export controls, as well as broader legal areas such as contract and trade secret law.

Our Guide gives the most comprehensive presentation to date of this landscape of legal risks, with an eye to both legal and technical nuance. Aimed at researchers, the public, and technology lawyers alike, its aims both to provide pragmatic guidance to those navigating today’s uncertain legal landscape, and to provoke public debate towards future reform.

Comprehensive, and well worth reading.

Here’s a Twitter thread by Kendra.

Posted on October 30, 2020 at 9:14 AMView Comments

On Risk-Based Authentication

Interesting usability study: “More Than Just Good Passwords? A Study on Usability and Security Perceptions of Risk-based Authentication“:

Abstract: Risk-based Authentication (RBA) is an adaptive security measure to strengthen password-based authentication. RBA monitors additional features during login, and when observed feature values differ significantly from previously seen ones, users have to provide additional authentication factors such as a verification code. RBA has the potential to offer more usable authentication, but the usability and the security perceptions of RBA are not studied well.

We present the results of a between-group lab study (n=65) to evaluate usability and security perceptions of two RBA variants, one 2FA variant, and password-only authentication. Our study shows with significant results that RBA is considered to be more usable than the studied 2FA variants, while it is perceived as more secure than password-only authentication in general and comparably se-cure to 2FA in a variety of application types. We also observed RBA usability problems and provide recommendations for mitigation.Our contribution provides a first deeper understanding of the users’perception of RBA and helps to improve RBA implementations for a broader user acceptance.

Paper’s website. I’ve blogged about risk-based authentication before.

Posted on October 5, 2020 at 11:47 AMView Comments

1 2 3 22

Sidebar photo of Bruce Schneier by Joe MacInnis.