Hacking a Gene Sequencer by Encoding Malware in a DNA Strand

One of the common ways to hack a computer is to mess with its input data. That is, if you can feed the computer data that it interprets—or misinterprets—in a particular way, you can trick the computer into doing things that it wasn’t intended to do. This is basically what a buffer overflow attack is: the data input overflows a buffer and ends up being executed by the computer process.
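As a rough illustration of the missing check, here is a minimal sketch in Python. Python is memory-safe, so this only models the logic of a fixed-size buffer; it does not reproduce a real overflow. The names `BUFFER_SIZE` and `copy_into_buffer` are hypothetical and chosen for illustration.

```python
BUFFER_SIZE = 16  # hypothetical fixed capacity, chosen for illustration


def copy_into_buffer(data: bytes, size: int = BUFFER_SIZE) -> bytearray:
    """Copy input into a fixed-size buffer, rejecting anything too long."""
    if len(data) > size:
        # This length check is what a vulnerable program omits.
        raise ValueError(f"input of {len(data)} bytes exceeds buffer of {size}")
    buffer = bytearray(size)
    # Same-length slice assignment, so the buffer never grows.
    buffer[:len(data)] = data
    return buffer
```

In a language like C the copy would proceed without the check, and the extra bytes would land in whatever memory sits next to the buffer.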

Well, some researchers did this with a computer that processes DNA, and they encoded their malware in the DNA strands themselves:

To make the malware, the team translated a simple computer command into a short stretch of 176 DNA letters, denoted as A, G, C, and T. After ordering copies of the DNA from a vendor for $89, they fed the strands to a sequencing machine, which read off the gene letters, storing them as binary digits, 0s and 1s.

Erlich says the attack took advantage of a spill-over effect, when data that exceeds a storage buffer can be interpreted as a computer command. In this case, the command contacted a server controlled by Kohno’s team, from which they took control of a computer in their lab they were using to analyze the DNA file.
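To make the letters-to-bits step concrete, here is a minimal sketch of one way to pack bytes into DNA letters and back. The two-bits-per-base mapping below is an assumption for illustration; the quoted article does not say which scheme the team used, and the function names are hypothetical.

```python
# Assumed 2-bit mapping; the quoted article does not state the scheme used.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(payload: bytes) -> str:
    """Encode each byte as four bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in payload)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand whose length is a multiple of four bases."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

Under this assumed mapping, 176 letters carry 352 bits, or 44 bytes, which is enough room for a short command.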

News articles. Research paper.

Posted on August 15, 2017 at 6:00 AM • 39 Comments

Comments

TimH August 15, 2017 9:04 AM

@Fed: No, they exploited the flaw on purpose. Yes, because an analysis machine’s results are de facto treated as fact by courts that can have you killed.

Rico August 15, 2017 9:07 AM

Yes, they created the flaw and then created the input data to demo the flaw, and yet it was still trivial. Hey, tomorrow I’ll write some insecure code of my own and run it on a sewing machine so that it is somehow unique and “novel”. This seems to be newsworthy for some reason…

Federico August 15, 2017 9:09 AM

@Tim sorry, I forgot that you guys still have the death penalty (not that life in prison is much better).

Still, “[we targeted a program] that we modified to include a known vulnerability” to me doesn’t sound like “exploiting on purpose”, but “exploiting something we put there on purpose”. I can concede though that my English is not top notch, as it is only my second language, so I might be misunderstanding.

Impossibly Stupid August 15, 2017 9:45 AM

While this isn’t an earth shattering proof of concept, it does add yet another data point to the argument that it’s not such a great idea to have general purpose computers everywhere. At least not from a security perspective. And hooking them all together a la IoT only makes matters worse.

Anura August 15, 2017 9:47 AM

@Federico

It’s a proof of concept that shows that the DNA can be used as a vector for malware transmission. While their exploit itself is designed specifically for this proof of concept, it can potentially be used with many different possible flaws. I highly doubt that the people who code gene sequencers put much thought into vulnerabilities.

D-503 August 15, 2017 10:23 AM

@Anura
It’s a proof of concept that data can be used as a vector for malware transmission.
To me, an interesting part of this story is how IDT’s prices have continued to drop.
By the way, that DNA sequence was particularly hard to chemically synthesize b/c of repeats, and because of runs of the same base (“letter”).
Something akin to Moore’s Law has applied to DNA technology for many years, but we’re still far from being able to make a pet dinosaur genome from scratch in the test tube.

wumpus August 15, 2017 10:37 AM

While a buffer overflow sounds unreasonable in DNA, who is to say that it isn’t possible to manufacture a DNA strand longer than the shortest buffer shipping in DNA sequencers? Programmers working on DNA sequencers are unlikely to be programming as if on “Satan’s computer” (to use Bruce Schneier’s terminology) and may well assume that the longest natural DNA sequence is the longest expected sequence.

As a real attack, this is pretty pointless (it has to be vastly easier to get physical access or suborn a tech than to attack the machine this way). But it is a demonstration that anything written in C/C++, or that otherwise contains patterns that allow buffer overflows, should be considered insecure and quite potentially pwned, no matter what it does. Certainly anything written in such a way as to take buffer overflows from anywhere shouldn’t be used in a court of law.

Alan Bostick August 15, 2017 10:43 AM

Cartoon idea: Kid and parent talking to an angry pediatrician. The kid has grotesque appendages growing out of his neck and shoulders. The parent is saying, “Why yes. We call him ‘Little Bobby Tables’…”

thoromyr August 15, 2017 10:58 AM

While their malware relied on complicit code pre-existing on the computer, that does not mean the vector is not real. It seems that they took a shortcut: instead of finding something vulnerable to leverage, they created something vulnerable in order for the exploit to work.

Let’s put this another way: in principle it is possible to compromise a web camera just by showing it a picture. The “trick” is in forcing the camera’s sensors to a special input that will result in a buffer overflow in its jpeg/mpeg encoding.

Or, as more recently reported, inducing vibrations to cause a device’s sensors to provide unexpected input for processing. To my knowledge no exploit has actually been found that can be leveraged by this attack, but the principle is sound: inputs should never be trusted.

At present it seems to be a very manufactured molehill hyped into something bigger. But hopefully such attention will raise awareness about the need to never trust inputs.

Chris Miller August 15, 2017 10:59 AM

“may well assume that the longest natural DNA sequence is the longest expected sequence.”

Well, the longest natural DNA sequence in a human is ~249 million bp (chr1), and given that standard Illumina machines sequence 150bp reads at max… 😛

That aside, yes, this is a silly proof of concept, if for no other reason than because virtually all scientific software is riddled with security holes. It’s not designed to defend against attack vectors, because a) there’s very little that’s public facing, and b) very little that’s worth stealing.

AJWM August 15, 2017 11:19 AM

By the way, that DNA sequence was particularly hard to chemically synthesize b/c of repeats, and because of runs of the same base (“letter”).

This. It’s easy to forget (because nearly every source presents it in simplified form) that the abstraction of DNA as a simple base-4 sequence that codes an amino acid every three digits is just that: an abstraction.

In the real world DNA is a molecule with messy analog properties like regions of positive and negative charge which want to distort the molecule. Too many of the wrong kind and the molecule either breaks apart or binds itself into knots that the enzymes in a sequencer can’t handle.

There are ways around it (as this experiment demonstrates) but it isn’t easy. (Probably involves another layer of abstraction, like so much else in computer science.)

BTW, the idea of specifically coded DNA tags to hack a detector isn’t new. Five years ago (2012) I published (as part of my SF novel The Reticuli Deception) the idea of specifically designed DNA tags (inserted via retrovirus) to bypass certain identity checks in routine DNA identification scans. The idea is similar to the way in which the “constellation EURion” (Omron rings) will prevent color copiers from copying banknotes. (That’s not the main focus of the book; it’s a few throwaway scenes to explain how an agent evades detection. That, plus fingerprint modification and sub-corneal filters to distort his iris and retina patterns enough to fool a standard scanner.)

AJWM August 15, 2017 11:32 AM

the principle is sound: inputs should never be trusted.

As anyone who has ever seen an optical illusion should understand. 😉 Magicians routinely hack the human visual system.

Wael August 15, 2017 12:12 PM

@AJWM,

Five years ago (2012) I published (as part of my SF novel The Reticuli Deception)

Looked at the novel you referenced. I’ll get a copy of this instead. It’s more geared towards my interest. Hmm! Looks like 3 books?

Clive Robinson August 15, 2017 1:04 PM

@ Anura,

I highly doubt that the people who code gene sequencers put much thought into vulnerabilities.

No. As somebody in management etc. would probably say if asked, “Why bother, it’ll never happen” or similar “last words”.

The point people really need to wake up to is “chain of evidence”. If some vaguely smart legal eagle finds out it’s possible to modify the machine’s operation, then the company is dead in the water. Because not only will the evidence get kicked out in that case, but in short order many other cases will get appeals, with a reasonable chance some sentences will get overturned. Which is good if you are an innocent person serving life or sitting in isolation on death row. But if not innocent then it’s bad for society in general…

Clive Robinson August 15, 2017 1:20 PM

@ Alan Bostick,

That is really sick, but it made me laugh, which on a day like today rates at least a +3

@ London Commuters,

For fellow South Londoners and others who have to go through Waterloo Station, can I suggest that rather than try to get home you find a bar and go for a “lock in”. At least when you stagger into work in the morning looking like something a cat has vomited up, you can blame it all on South West Trains and Network Fail…

Having got through the mess and problems at Waterloo, along with the fact that my local station is also closed, and having had to deal with our legal brethren this PM, I’ve let it be known that tomorrow is a “Homer to do paperwork” (which might also involve a little wax figurine modelling and pin poking at vitals 😉)

albert August 15, 2017 2:46 PM

That computers are hackable is a given. Buffer bounds checking is trivial, and should be part of any system that’s designed to handle input from untrusted (that is, any) sources. I never experienced a buffer overrun in VB, nor did I ever experience an error that I didn’t cause, in my code.

My big concern is the possibility of hacking gene sequences. Successful attacks could render the whole DNA analysis paradigm as useless as tits on a boar hog. Not to mention evil geniuses trying to create bespoke viruses for world domination.

Scientists tend to jump ahead of what they think they understand, while the capitalists are eager to push monetization at all costs.

@Clive,
“…(which might also involve a little wax figurine modelling and pin poking at vitals ;-)…”

Amazingly accurate little dolls could be made with 3D printers. Then you wouldn’t need to write names on them. Clothes can be included. The printer could even mark anatomically correct locations for various organs. I know that there’s s/w available that creates 3D from 2D images. That’s all you need:)

Say hi to Uncle Bob for me.

. .. . .. — ….

AJWM August 15, 2017 4:03 PM

@Wael

Thanks for your interest!

Hmm! Looks like 3 books?

Yes, the one you linked to is the third of a trilogy.

(And apologies to our host. No spam intended.)

Wael August 15, 2017 4:15 PM

@AJWM,

And apologies to our host. No spam intended.

We’ll make it up. I’ll point out the security related matters in the books. Problem is the host posts some book titles and never says why! “Decoded” is an example!

Ryan August 15, 2017 8:53 PM

That threat remains more of a plot point in a Michael Crichton novel than one that should concern computational biologists. But as genetic sequencing is increasingly handled by centralized services, often run by university labs that own the expensive gene sequencing equipment, that DNA-borne malware trick becomes ever so slightly more realistic.

Anselm August 16, 2017 4:15 AM

One reason why the researchers didn’t exploit a COTS DNA analysis machine (which, judging from what one hears about other medical-type embedded systems, is probably chock-full of easily findable and exploitable holes) might be that the manufacturer of the machine would come down on them like a ton of bricks, courtesy of the DMCA.

It’s much less likely that you will be sued for exposing a security vulnerability that you put there yourself in the first place. The original software in question probably comes from some university department that has no interest in, or money to waste on, suing another university department, and the proof-of-concept is still valid.

Peter Smart August 16, 2017 5:06 AM

A defense would be to correlate multiple implementations – i.e. multiple labs running different software.

The good thing is that the problem is mostly a pure function on the input and thus it’s possible to run multiple implementations.
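A minimal sketch of that idea, assuming each implementation can be wrapped as a function from a sample to a result string. The names `cross_check` and `quorum` are hypothetical and not taken from any real lab software.

```python
from collections import Counter


def cross_check(sample, implementations, quorum):
    """Run independent implementations and accept only an agreed result."""
    results = [analyze(sample) for analyze in implementations]
    # most_common(1) returns the single most frequent result with its count.
    value, votes = Counter(results).most_common(1)[0]
    if votes < quorum:
        raise RuntimeError(f"only {votes} of {len(results)} implementations agree")
    return value
```

Setting `quorum` to the number of implementations demands unanimity, so a single compromised or buggy analyzer is enough to flag the sample for a closer look.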

EVM August 16, 2017 7:34 AM

All of the CS/EE type folks speculating on how sequencers work is pretty hilarious. Especially the comment about the human genome and the longest read by an Illumina. I’m not a microbiologist, but I have hung out with a few. 🙂 DNA is never sequenced by reading the entire sequence from beginning to end. Samples are prepped that rip cells apart, including the DNA strands. Luckily you have lots of copies in order to get the DNA back together. So the sequencer is doing lots of reads, and then looking for reads that overlap, in order to piece the order of the larger sequence together. Imagine you’ve got 100 copies of a book (with no page #s) and you rip them all up, and then you need to reassemble one copy of the book. This is basically how a modern sequencer works. So the idea that you could make a physical DNA sequence that reliably exploits a vulnerability in a sequencer regardless of how the sample is processed is just laughable.
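The ripped-up-book analogy can be sketched as a toy greedy assembler. This is a deliberately simplified illustration, not how production assemblers work: it assumes a non-empty list of error-free reads, and the function names are hypothetical.

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str]) -> str:
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_size, best_i, best_j = 0, 0, 1
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i != j:
                    size = overlap(left, right)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        # Join the best pair, keeping the shared region only once.
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads[0]
```

For example, `greedy_assemble(["ATGCG", "GCGTA", "GTACC"])` pieces the three overlapping reads back into `"ATGCGTACC"`. The order in which the fragments arrive does not matter here, which is the property the comment is pointing at.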

JG4 August 16, 2017 8:09 AM

@Max – your point about smallpox applies to worse biothreats published by Ken Alibek and others. those can be made by machines in the “wrong” hands or by machines in “right” hands that are repurposed by malware. I’ll leave to the forum the problem of deciding right and wrong. it should be noted that the definitions are variable along the axes of time, distance and networks, depend on who is making the determination and how much thermo-rectal cryptanalysis they’ve experienced in their time on the old blue marble

@Brilliant Pebble – your point is spot on. genetic diversity in populations is what enables a species to survive an evolutionary bottleneck. however, the survival of the machines is (currently) secondary to the survival of the makers. quite a few survived smallpox, plagues and various other epidemics, but would be wiped out by some of the bespoke diseases.

Martin Potter August 16, 2017 9:32 AM

The summary says :

After ordering copies of the DNA from a vendor for $89

No, I don’t think you get a “copy” of any DNA for $89. What you get is a test kit into which you place some DNA and then send it to the lab for analysis. The analysis consists of a (very) partial read-out of some of the DNA sequence of the sample you provided. The sequence is not even sequential but a bit from here and a bit from there — just the notable parts (for purposes of the test).

Some better explanation is required.

Me August 16, 2017 11:06 AM

@Martin

No, I think they bought a DNA sequence, this isn’t hard:
https://www.thermofisher.com/us/en/home/life-science/cloning/gene-synthesis/geneart-gene-synthesis.html

Thermofisher offers this service, and while I didn’t look at pricing (needed a sign up), their tutorial shows a price of about $450 for 3 items, so an $89 sequence isn’t beyond the pale.

@ Everyone
That said, this reminds me of a (ridiculous) episode of Bones where a serial killer hacked their network with kerf marks on a bone. My wife asked if that was possible, and I responded with yes, but he’d have to know the (likely custom) software better than everyone, without having access to it, and he’d likely need to know the orientation of the bone to a degree that isn’t humanly possible BEFORE it was placed on the table.

DNA makes that last issue less of a problem, though the first might still be a problem.

Otter August 16, 2017 4:18 PM

Given the number of news reports, sometimes with video, of cops slipping a baggie into the perp’s pocket, car, living room;
Given the news reports of cops helping themselves to the evidence locker for fun and profit;
Given the reports of expert witnesses and technicians with so much faith in the prosecutors’ ability to identify the criminal that they don’t bother to actually test the evidence;
Given Ryan’s citation of “centralized services often run by university labs”;
Any serious operation to influence a court case, subvert a research project, or mislead a medical test would limit its efforts to determining whom to bribe, threaten, befuddle.

Hacking is cool, but it is easier to buy the evidence or the testimony.

Randal August 16, 2017 6:12 PM

Wouldn’t A, G, C, and T be sufficient to store information as quantum qubits down the road?

225 August 17, 2017 1:19 AM

This is fun but stupid, it’s like saying I could hack a house through the light switch. Sure, maybe, if your light switch was way over-engineered in a way that let you. It would be trivial to avoid a buffer overrun when looking at an input like this.

To restate the point from the twitter – “it targeted one that we modified to contain a known vulnerability”

From here – “the principle is sound: inputs should never be trusted.” Neither should memory. Why even get out of bed in the morning if you might forget where you left your shoes?

Clive Robinson August 17, 2017 2:32 AM

@ 225,

This is fun but stupid, it’s like saying I could hack a house through the light switch.

That’s an argument from the “technical viewpoint”, which whilst important is only part of the story. You also need to consider the broader context, of where the argument might get used for other reasons.

As I noted above, “chain of custody” is very important to the legal profession, and it has a much broader context than just who is holding a piece of evidence at any point in time, due to the prosecutorial “burden of proof” of “beyond reasonable doubt”.

In forensics there is a “blind eye” principle by experts, because the science is beyond that which the law would once have allowed. That is, testing has gone beyond visual inspection and measurement to the point where evidence is destroyed in order to quantify it for comparison.

Thus there is a requirement that tests have to be carried out in a known deterministic way with invariant results for the same input. If not, then the prosecutorial burden cannot be met.

This little trick pulls up the corner of the rug underneath which the “blind eye” principle resides. Because it shows that the equipment result is not independent of its input. Thus whilst the machine may give repeatable output with the same input, it won’t give the same output as a different design of machine under all input.

As the difference between the two machines would not be one of precision but error by this machine, it is more than sufficient to meet the bar of “reasonable doubt”, thus the prosecutorial burden of “beyond reasonable doubt” is not met.

In the past this would not matter, as the evidence would still be available for “re-testing” because those old basic tests were “non destructive”. DNA testing works by destroying part or in some cases all of the item of evidence, thus comparative testing by the defence would not be possible, and legally that is only acceptable if and only if the test method and implementation is beyond reproach. Which a machine that makes errors would not be…

Thus any trial where the machine is used would be stigmatized with doubt, which could lead to the quashing of previous convictions.

225 August 17, 2017 8:01 AM

@Clive Robinson that’s just silly too. I see this article as the repeated advice to watch out for buffer overflows and non-sanitized inputs, you see it as some silver bullet that will make DNA evidence inadmissible. If tomorrow every case that relies on DNA evidence is thrown out then you are correct.

Clive Robinson August 17, 2017 10:10 AM

@ 225,

…you see it as some silver bullet that will make DNA evidence inadmissible.

I would suggest you get out from behind your rose tinted glasses and have a look at tech history and the legal profession. The high paid legal fraternity make their money by finding leverage that other lawyers don’t have the personnel to seek out.

But as for your “that’s just silly too” comment, you really only need to look at the history of air gap crossing as played out with BadBios. The number of people who said that the idea of using inaudible data signalling was either impossible or not workable was immense. The thing is they could not be bothered to do a little research.

I, amongst others here, had not only reasoned it was possible; 30-odd years ago we had built systems for data transfer and similar.

Now we have many people using covert audio channels… So there is a lesson there for you.

The first touchstone you should use is not what your gut from limited experience tells you, but whether something is possible under the rules of physics. Your second touchstone should be to look at history for similar or comparable events and why they happened. Because whilst technology might change rapidly, humans only evolve very, very slowly, especially when it comes to crime and avarice.

Anyway, it’s up to you to make your call the way you see it. Others will always disagree with you, such is humanity. The problem arises when they have history, science and even economics on their side.

Impossibly Stupid August 17, 2017 5:37 PM

The sad thing is, this is all already playing out in a very low-tech way as convictions are overturned and reinvestigated due to police fabricating body camera evidence, along with numerous incidents where questionable things happened when the cameras were “malfunctioning”. With back doors and hacks of all kinds possible with any sort of complex device, it’s getting harder and harder for a technically savvy citizen to sit on a jury and not raise a heaping helping of reasonable doubt when it comes to evaluating the evidence that is produced by these machines.

225 August 18, 2017 3:27 AM

@Clive Robinson

You are ignoring the impact of the chain of custody on evidence, and that DNA testing uses a tiny amount of the evidence swabbed up. I’m not saying there are no problems with DNA testing in law, I’m saying the problems have nothing to do with purchasing enough synthetically modified DNA to plant as evidence and hack a police computer via a sample. If you think it’s possible that your local forensics lab could have downgraded their DNA analysis machine to make it vulnerable to a hack like this, and that they use a single test sample that could be synthetic, please write a strongly worded letter to your MP.

Clive Robinson August 18, 2017 5:16 AM

@ 225,

If you think it’s possible that your local forensics lab could have downgraded their DNA analysis machine to make it vulnerable to a hack like this…

You are still looking at it the wrong way around.

Most of the computer programs behind scientific equipment have not been written for security. Worse, much of it is filled with edge and corner cases which vary from one manufacturer to another.

If as a lawyer I can find a corner case in the piece of equipment used by the local forensics lab compared to another manufacturer’s equipment, it gives me a crack I can use in court to sow the seeds of doubt not just in the jury but the judge as well. My client gets off, and I not only get paid, I get others with pockets full of money beating their way to my door.

We have seen this happen with celebrities going to a particular legal representative that “worked the system” this way to get them off various driving charges.

As more and more technology gets into the legal process we will see more and more of it.

The real problem is not those with money buying their way out of trouble but the hamfisted way politicos, legislators, judges and juries will try to balance the scales of justice. In essence the rich will always find those who can come up with new tricks to get them off. The knock on effect is the scales will get “a thumb on the other side” to redress the balance, and this means that those with lack of knowledge or resources will pay the price of the redress.

The correct way to stop this is the proper design, production, testing and maintenance of the technology.

As the debacle with voting machines has shown, there is a conflict of interest that will ensure it will not happen. The correct engineering techniques are expensive, and currently the drive is to reduce government spending, especially where justice is concerned. If you buy on the cheap you will get failures that ultimately cost way, way more to society. The current crop of politicos will not care, because the cheap and shoddy not just saves money today, it ups the conviction rate, so it is “win win today”; the cost to individuals and society will not become apparent for half a decade. The politicals can then kick it into the long grass for another half decade or so. Thus they will have moved the problem onto somebody in the future, which is yet another win for the politicals, especially if it’s the political opposition. Such short term thinking and responsibility avoidance is why western society is in the mess it is in.

Mall Mosquito August 23, 2017 8:39 PM

they took control of a computer in their lab they were using to analyze the DNA file.

What about that expert witness who comes into court to testify about the probability of the DNA match on that speck of dried blood or strand of hair that was vacuumed up from the floor of the crime scene?

Does it still constitute proof beyond a reasonable doubt?

Oh, noez! The searial killerz iz going to get away with the wicked d33ds!!11!

nanashi November 5, 2017 2:37 AM

Now I just need to write a DNA exploit and insert it into a few of my transposons, corrupting the memory state of an analysis machine so it is unable to match my real DNA, and no one can ever connect me to the scene of a crime… The possibilities are endless! Now just to figure out how to insert a new transposon into every cell in my body without dying of cancer…

