Attacking the Performance of Machine Learning Systems

Interesting research: “Sponge Examples: Energy-Latency Attacks on Neural Networks”:

Abstract: The high energy costs of neural network training and inference led to the use of acceleration hardware such as GPUs and TPUs. While such devices enable us to train large-scale neural networks in datacenters and deploy them on edge devices, their designers’ focus so far is on average-case performance. In this work, we introduce a novel threat vector against neural networks whose energy consumption or decision latency are critical. We show how adversaries can exploit carefully-crafted sponge examples, which are inputs designed to maximise energy consumption and latency, to drive machine learning (ML) systems towards their worst-case performance. Sponge examples are, to our knowledge, the first denial-of-service attack against the ML components of such systems. We mount two variants of our sponge attack on a wide range of state-of-the-art neural network models, and find that language models are surprisingly vulnerable. Sponge examples frequently increase both latency and energy consumption of these models by a factor of 30×. Extensive experiments show that our new attack is effective across different hardware platforms (CPU, GPU and an ASIC simulator) on a wide range of different language tasks. On vision tasks, we show that sponge examples can be produced and a latency degradation observed, but the effect is less pronounced. To demonstrate the effectiveness of sponge examples in the real world, we mount an attack against Microsoft Azure’s translator and show an increase of response time from 1ms to 6s (6000×). We conclude by proposing a defense strategy: shifting the analysis of energy consumption in hardware from an average-case to a worst-case perspective.

Attackers were able to degrade the performance so much, and force the system to waste so many cycles, that some hardware would shut down due to overheating. Definitely a “novel threat vector.”
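To make the idea concrete, here is a minimal sketch of how a search for "sponge" inputs might look. It is not the paper's code. Everything in it is an assumption for illustration: the victim is a toy tokenizer (known words cost one token, unknown words are split into single characters), the token count stands in for energy or latency, and the names `VOCAB`, `cost`, `mutate`, `crossover` and `evolve` are hypothetical. The search is a simple genetic algorithm that keeps the most expensive fixed-length inputs and breeds new ones from them.

```python
import random
import string

# Hypothetical stand-in for a victim system: a toy tokenizer whose work grows
# with the number of tokens it emits. This is NOT the paper's model.
VOCAB = {"the", "cat", "sat", "on", "mat", "a", "dog"}
ALPHABET = string.ascii_lowercase + " "


def cost(text: str) -> int:
    """Count the tokens the toy tokenizer would emit (proxy for latency/energy)."""
    total = 0
    for word in text.split():
        # Known words are one token; unknown words fall back to one token per character.
        total += 1 if word in VOCAB else len(word)
    return total


def mutate(text: str, rng: random.Random, rate: float = 0.1) -> str:
    """Replace each character with a random one with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch for ch in text)


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Splice two equal-length parents at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(seed_text: str, generations: int = 50, pop_size: int = 30, seed: int = 0) -> str:
    """Search for a same-length input that maximises the toy cost."""
    rng = random.Random(seed)
    population = [mutate(seed_text, rng, rate=0.3) for _ in range(pop_size - 1)]
    population.append(seed_text)
    for _ in range(generations):
        # Keep the most expensive inputs unchanged (elitism), then breed the rest.
        population.sort(key=cost, reverse=True)
        parents = population[: pop_size // 5]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=cost)


if __name__ == "__main__":
    benign = "the cat sat on the mat"
    sponge = evolve(benign)
    print(f"benign cost: {cost(benign)}  input: {benign!r}")
    print(f"sponge cost: {cost(sponge)}  input: {sponge!r}")
```

The input length never changes, only its content, so any extra cost comes from how the input is processed rather than from its size. Because the best candidates survive each generation, the result is never cheaper than the benign starting input.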

Tags: academic papers, cyberattack, machine learning

Posted on June 16, 2022 at 6:02 AM • 9 Comments

Comments

John • June 16, 2022 9:51 AM

Hmm….

Cheaper to hire a real human being!

John

Jeadly • June 16, 2022 9:52 AM

I believe this attack vector was pioneered in 1996 on Seinfeld:

Kramer: George has gotta be happy about this.

Jerry (indifferent): Yeah, yeah, yeah…

Jerry: Oh my God, Kramer, is that woman just wearing a bra?

(Sue Ellen is seen walking down the street)

Kramer: Oh, mama.

Jerry: Kramer!!!

(Car crashes into a lamp post)

Ted • June 16, 2022 9:55 AM

Congrats to the researchers for getting sponge examples added to the MITRE attack framework for AI security! I believe it’s under Cost Harvesting? The risk of denial-of-service attacks against ML systems should not be ignored.

https://atlas.mitre.org/techniques/AML.T0034

I’m secretly hoping to find the crazy list of energy-consuming inputs sent to the NLPs. So the 50-character inputs sent to Microsoft Azure’s translator were groups of the Frankenstein-words created from the genetic algorithm? I like the example of the genetic algorithm in the paper.

The paper is additionally interesting because it does provide some insights into NLP processes.

BCS • June 16, 2022 1:56 PM

It’s not clear from the quote; is this an attack on the training phase or the use of the final model? Both would be interesting but for much different reasons.

Luc • June 16, 2022 6:09 PM

PDF can be found here https://arxiv.org/abs/2006.03463

David Leppik • June 16, 2022 6:24 PM

See also: Bruce’s blogging on this 2020 research on the same kind of attack.

As I summarized back then, neural networks are “feed forward”: they simulate circuitry with no loops—so they can’t get into infinite loops like regular programs. This makes it harder to overwhelm them.

Neural network hardware relies on the fact that, for most input, the useful parts of networks consist of sparse matrixes. So a little hardware can simulate huge matrixes—using loops and memory caches.

The trick of this kind of attack is to give ambiguous input to the network, maximizing cache misses. It’s like doing a DDOS on a search engine by combining all the most obscure keywords in a single query.

Ted • June 16, 2022 9:34 PM

@BCS

You will like this video about the research:

https://m.youtube.com/watch?v=6M_T_-im7PY

James L. Dean • July 15, 2022 9:56 AM

Have inputs been found that have a similar effect on the human brain?

Winter • July 15, 2022 10:14 AM

@James L. Dean

Have inputs been found that have a similar effect on the human brain?

I would say TV, Sports, Religion, and Politics often have the same effects. The number of person-years that have been wasted on any of these with nothing to show for it is monumental for some people.

Schneier on Security
