Failure Modes in Machine Learning
Interesting taxonomy of machine-learning failures (pdf) that encompasses both mistakes and attacks, or — in their words — unintentional and intentional failure modes. It’s a good basis for threat modeling.
I have written a paper with Orin Kerr on encryption workarounds. Our goal wasn’t to make any policy recommendations. (That was a good thing, since we probably don’t agree on any.) Our goal was to present a taxonomy of different workarounds, and discuss their technical and legal characteristics and complications.
Abstract: The widespread use of encryption has triggered a new step in many criminal investigations: the encryption workaround. We define an encryption workaround as any lawful government effort to reveal an unencrypted version of a target’s data that has been concealed by encryption. This essay provides an overview of encryption workarounds. It begins with a taxonomy of the different ways investigators might try to bypass encryption schemes. We classify six kinds of workarounds: find the key, guess the key, compel the key, exploit a flaw in the encryption software, access plaintext while the device is in use, and locate another plaintext copy. For each approach, we consider the practical, technological, and legal hurdles raised by its use.
The remainder of the essay develops lessons about encryption workarounds and the broader public debate about encryption in criminal investigations. First, encryption workarounds are inherently probabilistic. None work every time, and none can be categorically ruled out every time. Second, the different resources required for different workarounds will have significant distributional effects on law enforcement. Some techniques are inexpensive and can be used often by many law enforcement agencies; some are sophisticated or expensive and likely to be used rarely and only by a few. Third, the scope of legal authority to compel third-party assistance will be a continuing challenge. And fourth, the law governing encryption workarounds remains uncertain and underdeveloped. Whether encryption will be a game-changer or a speed bump depends on both technological change and the resolution of important legal questions that currently remain unanswered.
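To make the first lesson concrete: consider the guess-the-key workaround as a toy model. Nothing below comes from the paper; the passphrase distribution and guessing budgets are invented purely for illustration:

```python
# Toy model of the "guess the key" workaround. Success is probabilistic:
# it depends on how the target chose the passphrase and on how many
# guesses the investigators can afford. All numbers here are invented.

def guess_success_probability(passphrase_probs, budget):
    """Probability that a most-likely-first guesser succeeds within
    `budget` guesses, given each candidate passphrase's probability."""
    ordered = sorted(passphrase_probs, reverse=True)
    return sum(ordered[:budget])

# Hypothetical population: 30% of targets pick one of 1,000 common
# passphrases; the rest use passphrases outside any feasible dictionary.
common = [0.30 / 1000] * 1000

print(guess_success_probability(common, budget=100))   # ~0.03
print(guess_success_probability(common, budget=1000))  # ~0.30, the ceiling
```

Even an unlimited budget never pushes success past the fraction of targets with guessable passphrases, which is the sense in which no workaround works every time.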
The paper is finished, but we’ll be revising it once more before final publication. Comments are appreciated.
I’m a big fan of taxonomies, and this — from Carnegie Mellon — seems like a useful one:
The taxonomy of operational cyber security risks, summarized in Table 1 and detailed in this section, is structured around a hierarchy of classes, subclasses, and elements. The taxonomy has four main classes:
- actions of people — action, or lack of action, taken by people either deliberately or accidentally that impacts cyber security
- systems and technology failures — failure of hardware, software, and information systems
- failed internal processes — problems in the internal business processes that impact the ability to implement, manage, and sustain cyber security, such as process design, execution, and control
- external events — issues often outside the control of the organization, such as disasters, legal issues, business issues, and service provider dependencies
Each of these four classes is further decomposed into subclasses, and each subclass is described by its elements.
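The class/subclass/element structure maps naturally onto nested data. Here's a minimal sketch; the subclasses and elements shown are only a sample filled in from the descriptions above, not the complete set in the report's Table 1:

```python
# Sketch of the taxonomy's class -> subclass -> element hierarchy.
# Only a few subclasses and elements are filled in for illustration;
# Table 1 of the CMU report has the complete set.
taxonomy = {
    "actions of people": {
        "inadvertent": ["mistakes", "errors", "omissions"],
        "deliberate": ["fraud", "sabotage", "theft"],
    },
    "systems and technology failures": {
        "hardware": ["capacity", "performance", "maintenance"],
        "software": ["compatibility", "configuration management"],
    },
    "failed internal processes": {
        "process design or execution": ["design", "execution", "control"],
    },
    "external events": {
        "disasters": ["weather event", "fire", "flood"],
        "legal issues": ["regulatory compliance", "litigation"],
    },
}

def elements_of(cls, subclass):
    """Return the elements that describe a given subclass."""
    return taxonomy[cls][subclass]

print(elements_of("actions of people", "deliberate"))
# ['fraud', 'sabotage', 'theft']
```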
I found this article on the difference between threats and vulnerabilities to be very interesting. I like his taxonomy.
Roger Grimes has an article describing “the seven types of malicious hackers.” I generally like taxonomies, and this one is pretty good.
He says the seven types are:
Lately I’ve been reading about user security and privacy — control, really — on social networking sites. The issues are hard and the solutions harder, but I’m seeing a lot of confusion in even forming the questions. Social networking sites deal with several different types of user data, and it’s essential to separate them.
Below is my taxonomy of social networking data, which I first presented at the Internet Governance Forum meeting last November, and again — revised — at an OECD workshop on the role of Internet intermediaries in June.
- Service data is the data you give to a social networking site in order to use it: your legal name, your age, maybe your credit-card number.
- Disclosed data is what you post on your own pages: blog entries, photographs, messages, comments, and so on.
- Entrusted data is what you post on other people's pages. It's basically the same stuff as disclosed data, but you don't have control over the data once you post it; another user does.
- Incidental data is what other people post about you: a paragraph about you that someone else writes, a picture of you that someone else takes and posts. You didn't create it, and you can't control it.
- Behavioral data is data the site collects about your habits by recording what you do and who you do it with: what games you play, what topics you write about, what news articles you access.
- Derived data is data about you that is derived from all the other data. For example, if 80 percent of your friends self-identify as gay, you're likely gay yourself.
There are other ways to look at user data. Some of it you give to the social networking site in confidence, expecting the site to safeguard the data. Some of it you publish openly and others use it to find you. And some of it you share only within an enumerated circle of other users. At the receiving end, social networking sites can monetize all of it: generally by selling targeted advertising.
Different social networking sites give users different rights for each data type. Some are always private, some can be made private, and some are always public. Some can be edited or deleted — I know one site that allows entrusted data to be edited or deleted within a 24-hour period — and some cannot. Some can be viewed and some cannot.
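To make that concrete, here is a sketch of how one hypothetical site's policy might assign rights per data type. Every value below is made up for illustration; real sites differ on each one:

```python
# Hypothetical per-data-type policy for an imaginary social networking
# site, keyed by the taxonomy above. Real sites differ on every field.
policy = {
    "service data":    {"visibility": "private",    "editable": False, "deletable": False},
    "disclosed data":  {"visibility": "user-set",   "editable": True,  "deletable": True},
    "entrusted data":  {"visibility": "user-set",   "editable": False, "deletable": False},
    "incidental data": {"visibility": "poster-set", "editable": False, "deletable": False},
    "behavioral data": {"visibility": "hidden",     "editable": False, "deletable": False},
    "derived data":    {"visibility": "hidden",     "editable": False, "deletable": False},
}

def user_rights(data_type):
    """What can the data subject do with this kind of data?"""
    p = policy[data_type]
    return [right for right in ("editable", "deletable") if p[right]]

print(user_rights("disclosed data"))   # ['editable', 'deletable']
print(user_rights("incidental data"))  # [] -- someone else controls it
```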
It’s also clear that users should have different rights with respect to each data type. We should be allowed to export, change, and delete disclosed data, even if the social networking sites don’t want us to. It’s less clear what rights we have for entrusted data — and far less clear for incidental data. If you post pictures from a party with me in them, can I demand you remove those pictures — or at least blur out my face? (Go look up the conviction of three Google executives in an Italian court over a YouTube video.) And what about behavioral data? It’s frequently a critical part of a social networking site’s business model. We often don’t mind if a site uses it to target advertisements, but are less sanguine when it sells data to third parties.
As we continue our conversations about what sorts of fundamental rights people have with respect to their data, and more countries contemplate regulation on social networking sites and user data, it will be important to keep this taxonomy in mind. The sorts of things that would be suitable for one type of data might be completely unworkable and inappropriate for another.
This essay previously appeared in IEEE Security & Privacy.
EDITED TO ADD: This post has been translated into Portuguese.
Seems there are a lot of them. They do it for marketing purposes. Really, they seem to do it because the code base they use does it automatically, or just because they can. (Initial reports that an Android wallpaper app was malicious seem to have been an overstatement; the developers are just incompetent, inadvertently collecting more data than necessary.)
Meanwhile, there’s now an Android rootkit available.
This blog entry should serve as a model for open and transparent security self-reporting. I’m impressed.
The researchers say they’ve found a vulnerability, if only a theoretical one, in U.S. law enforcement wiretaps that would allow a surveillance target to thwart the authorities by launching what amounts to a denial-of-service (DoS) attack against the connection between the phone company switches and law enforcement.
[…]
The University of Pennsylvania researchers found the flaw after examining the telecommunication industry standard ANSI Standard J-STD-025, which addresses the transmission of wiretapped data from telecom switches to authorities, according to IDG News Service. Under the 1994 Communications Assistance for Law Enforcement Act, or CALEA, telecoms are required to design their network architecture to make it easy for authorities to tap calls transmitted over digitally switched phone networks.
But the researchers, who describe their findings in a paper, found that the standard allows for very little bandwidth for the transmission of data about phone calls, which can be overwhelmed in a DoS attack. When a wiretap is enabled, the phone company’s switch establishes a 64-Kbps Call Data Channel to send data about the call to law enforcement. That paltry channel can be flooded if a target of the wiretap sends dozens of simultaneous SMS messages or makes numerous VOIP phone calls “without significant degradation of service to the targets’ actual traffic.”
As a result, the researchers say, law enforcement could lose records of whom a target called and when. The attack could also prevent the content of calls from being accurately monitored or recorded.
The paper. Comments by Matt Blaze, one of the paper’s authors.
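The arithmetic behind the attack is simple to sketch. The per-record size below is my assumption for illustration; the paper reports the actual message formats and measured rates:

```python
# Back-of-the-envelope: how much signaling traffic saturates the
# 64 Kbps Call Data Channel (CDC) that J-STD-025 allots per wiretap?
# The record size is an assumed figure for illustration only; the
# paper has real measurements.

CDC_BITS_PER_SEC = 64_000   # channel capacity under the standard
RECORD_BYTES = 64           # assumed size of one call-data record

records_per_sec = CDC_BITS_PER_SEC / (RECORD_BYTES * 8)
print(records_per_sec)  # 125.0

# Each SMS or short VOIP call generates one or more such records, so a
# target generating on the order of a hundred signaling events per
# second, which is cheap over SMS or VOIP, can keep the CDC full and
# cause law enforcement to lose call records.
```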
At the Internet Governance Forum in Sharm El Sheikh this week, there was a conversation on social networking data. Someone made the point that there are several different types of data, and it would be useful to separate them. This is my taxonomy of social networking data.
Different social networking sites give users different rights for each data type. Some are always private, some can be made private, and some are always public. Some can be edited or deleted — I know one site that allows entrusted data to be edited or deleted within a 24-hour period — and some cannot. Some can be viewed and some cannot.
And people should have different rights with respect to each data type. It’s clear that people should be allowed to change and delete their disclosed data. It’s less clear what rights they have for their entrusted data. And far less clear for their incidental data. If you post pictures of a party with me in them, can I demand you remove those pictures — or at least blur out my face? And what about behavioral data? It’s often a critical part of a social networking site’s business model. We often don’t mind if they use it to target advertisements, but are probably less sanguine about them selling it to third parties.
As we continue our conversations about what sorts of fundamental rights people have with respect to their data, this taxonomy will be useful.
EDITED TO ADD (12/12): Another categorization centered on destination instead of trust level.