Entries Tagged "machine learning"

Page 5 of 5

Visiting the NSA

Yesterday, I visited the NSA. It was Cyber Command’s birthday, but that’s not why I was there. I visited as part of the Berklett Cybersecurity Project, run out of the Berkman Klein Center and funded by the Hewlett Foundation. (BERKman hewLETT—get it? We have a web page, but it’s badly out of date.)

It was a full day of meetings, all unclassified but under the Chatham House Rule. Gen. Nakasone welcomed us and took questions at the start. Various senior officials spoke with us on a variety of topics, but mostly focused on three areas:

  • Russian influence operations, both what the NSA and US Cyber Command did during the 2018 election and what they can do in the future;
  • China and the threats to critical infrastructure from untrusted computer hardware, both the 5G network and more broadly;
  • Machine learning, both how to ensure a ML system is compliant with all laws, and how ML can help with other compliance tasks.

It was all interesting. Those first two topics are ones that I am thinking and writing about, and it was good to hear their perspective. I find that I am much more closely aligned with the NSA about cybersecurity than I am about privacy, which made the meeting much less fraught than it would have been if we were discussing Section 702 of the FISA Amendments Act, Section 215 the USA Freedom Act (up for renewal next year), or any 4th Amendment violations. I don’t think we’re past those issues by any means, but they make up less of what I am working on.

Posted on May 22, 2019 at 2:11 PMView Comments

Maliciously Tampering with Medical Imagery

In what I am sure is only a first in many similar demonstrations, researchers are able to add or remove cancer signs from CT scans. The results easily fool radiologists.

I don’t think the medical device industry has thought at all about data integrity and authentication issues. In a world where sensor data of all kinds is undetectably manipulatable, they’re going to have to start.

Research paper. Slashdot thread.

Posted on April 12, 2019 at 11:13 AMView Comments

Adversarial Machine Learning against Tesla's Autopilot

Researchers have been able to fool Tesla’s autopilot in a variety of ways, including convincing it to drive into oncoming traffic. It requires the placement of stickers on the road.

Abstract: Keen Security Lab has maintained the security research work on Tesla vehicle and shared our research results on Black Hat USA 2017 and 2018 in a row. Based on the ROOT privilege of the APE (Tesla Autopilot ECU, software version 18.6.1), we did some further interesting research work on this module. We analyzed the CAN messaging functions of APE, and successfully got remote control of the steering system in a contact-less way. We used an improved optimization algorithm to generate adversarial examples of the features (autowipers and lane recognition) which make decisions purely based on camera data, and successfully achieved the adversarial example attack in the physical world. In addition, we also found a potential high-risk design weakness of the lane recognition when the vehicle is in Autosteer mode. The whole article is divided into four parts: first a brief introduction of Autopilot, after that we will introduce how to send control commands from APE to control the steering system when the car is driving. In the last two sections, we will introduce the implementation details of the autowipers and lane recognition features, as well as our adversarial example attacking methods in the physical world. In our research, we believe that we made three creative contributions:

  1. We proved that we can remotely gain the root privilege of APE and control the steering system.
  2. We proved that we can disturb the autowipers function by using adversarial examples in the physical world.
  3. We proved that we can mislead the Tesla car into the reverse lane with minor changes on the road.

You can see the stickers in this photo. They’re unobtrusive.

This is machine learning’s big problem, and I think solving it is a lot harder than many believe.

Posted on April 4, 2019 at 6:18 AMView Comments

Machine Learning to Detect Software Vulnerabilities

No one doubts that artificial intelligence (AI) and machine learning (ML) will transform cybersecurity. We just don’t know how, or when. While the literature generally focuses on the different uses of AI by attackers and defenders ­ and the resultant arms race between the two ­ I want to talk about software vulnerabilities.

All software contains bugs. The reason is basically economic: The market doesn’t want to pay for quality software. With a few exceptions, such as the space shuttle, the market prioritizes fast and cheap over good. The result is that any large modern software package contains hundreds or thousands of bugs.

Some percentage of bugs are also vulnerabilities, and a percentage of those are exploitable vulnerabilities, meaning an attacker who knows about them can attack the underlying system in some way. And some percentage of those are discovered and used. This is why your computer and smartphone software is constantly being patched; software vendors are fixing bugs that are also vulnerabilities that have been discovered and are being used.

Everything would be better if software vendors found and fixed all bugs during the design and development process, but, as I said, the market doesn’t reward that kind of delay and expense. AI, and machine learning in particular, has the potential to forever change this trade-off.

The problem of finding software vulnerabilities seems well-suited for ML systems. Going through code line by line is just the sort of tedious problem that computers excel at, if we can only teach them what a vulnerability looks like. There are challenges with that, of course, but there is already a healthy amount of academic literature on the topic—and research is continuing. There’s every reason to expect ML systems to get better at this as time goes on, and some reason to expect them to eventually become very good at it.

Finding vulnerabilities can benefit both attackers and defenders, but it’s not a fair fight. When an attacker’s ML system finds a vulnerability in software, the attacker can use it to compromise systems. When a defender’s ML system finds the same vulnerability, he or she can try to patch the system or program network defenses to watch for and block code that tries to exploit it.

But when the same system is in the hands of a software developer who uses it to find the vulnerability before the software is ever released, the developer fixes it so it can never be used in the first place. The ML system will probably be part of his or her software design tools and will automatically find and fix vulnerabilities while the code is still in development.

Fast-forward a decade or so into the future. We might say to each other, “Remember those years when software vulnerabilities were a thing, before ML vulnerability finders were built into every compiler and fixed them before the software was ever released? Wow, those were crazy years.” Not only is this future possible, but I would bet on it.

Getting from here to there will be a dangerous ride, though. Those vulnerability finders will first be unleashed on existing software, giving attackers hundreds if not thousands of vulnerabilities to exploit in real-world attacks. Sure, defenders can use the same systems, but many of today’s Internet of Things systems have no engineering teams to write patches and no ability to download and install patches. The result will be hundreds of vulnerabilities that attackers can find and use.

But if we look far enough into the horizon, we can see a future where software vulnerabilities are a thing of the past. Then we’ll just have to worry about whatever new and more advanced attack techniques those AI systems come up with.

This essay previously appeared on SecurityIntelligence.com.

Posted on January 8, 2019 at 6:13 AMView Comments

Using Machine Learning to Create Fake Fingerprints

Researchers are able to create fake fingerprints that result in a 20% false-positive rate.

The problem is that these sensors obtain only partial images of users’ fingerprints—at the points where they make contact with the scanner. The paper noted that since partial prints are not as distinctive as complete prints, the chances of one partial print getting matched with another is high.

The artificially generated prints, dubbed DeepMasterPrints by the researchers, capitalize on the aforementioned vulnerability to accurately imitate one in five fingerprints in a database. The database was originally supposed to have only an error rate of one in a thousand.

Another vulnerability exploited by the researchers was the high prevalence of some natural fingerprint features such as loops and whorls, compared to others. With this understanding, the team generated some prints that contain several of these common features. They found that these artificial prints were more likely to match with other prints than would be normally possible.

If this result is robust—and I assume it will be improved upon over the coming years—it will make the current generation of fingerprint readers obsolete as secure biometrics. It also opens a new chapter in the arms race between biometric authentication systems and fake biometrics that can fool them.

More interestingly, I wonder if similar techniques can be brought to bear against other biometrics are well.

Research paper.

Slashdot thread

Posted on November 23, 2018 at 6:11 AMView Comments

Identifying Programmers by Their Coding Style

Fascinating research on de-anonymizing code—from either source code or compiled code:

Rachel Greenstadt, an associate professor of computer science at Drexel University, and Aylin Caliskan, Greenstadt’s former PhD student and now an assistant professor at George Washington University, have found that code, like other forms of stylistic expression, are not anonymous. At the DefCon hacking conference Friday, the pair will present a number of studies they’ve conducted using machine learning techniques to de-anonymize the authors of code samples. Their work could be useful in a plagiarism dispute, for instance, but it also has privacy implications, especially for the thousands of developers who contribute open source code to the world.

Posted on August 13, 2018 at 4:02 PMView Comments

Detecting Phishing Sites with Machine Learning

Really interesting article:

A trained eye (or even a not-so-trained one) can discern when something phishy is going on with a domain or subdomain name. There are search tools, such as Censys.io, that allow humans to specifically search through the massive pile of certificate log entries for sites that spoof certain brands or functions common to identity-processing sites. But it’s not something humans can do in real time very well—which is where machine learning steps in.

StreamingPhish and the other tools apply a set of rules against the names within certificate log entries. In StreamingPhish’s case, these rules are the result of guided learning—a corpus of known good and bad domain names is processed and turned into a “classifier,” which (based on my anecdotal experience) can then fairly reliably identify potentially evil websites.

Posted on August 9, 2018 at 6:17 AMView Comments

1 3 4 5

Sidebar photo of Bruce Schneier by Joe MacInnis.