More Spectre/Meltdown-Like Attacks

Back in January, we learned about a class of vulnerabilities against microprocessors that leverages various performance and efficiency shortcuts for attack. I wrote that the first two attacks would be just the start:

It shouldn’t be surprising that microprocessor designers have been building insecure hardware for 20 years. What’s surprising is that it took 20 years to discover it. In their rush to make computers faster, they weren’t thinking about security. They didn’t have the expertise to find these vulnerabilities. And those who did were too busy finding normal software vulnerabilities to examine microprocessors. Security researchers are starting to look more closely at these systems, so expect to hear about more vulnerabilities along these lines.

Spectre and Meltdown are pretty catastrophic vulnerabilities, but they only affect the confidentiality of data. Now that they—and the research into the Intel ME vulnerability—have shown researchers where to look, more is coming—and what they’ll find will be worse than either Spectre or Meltdown. There will be vulnerabilities that will allow attackers to manipulate or delete data across processes, potentially fatal in the computers controlling our cars or implanted medical devices. These will be similarly impossible to fix, and the only strategy will be to throw our devices away and buy new ones.

We saw several variants over the year. And now researchers have discovered seven more.

Researchers say they’ve discovered the seven new CPU attacks while performing “a sound and extensible systematization of transient execution attacks”—a catch-all term the research team used to describe attacks on the various internal mechanisms that a CPU uses to process data, such as the speculative execution process, the CPU’s internal caches, and other internal execution stages.

The research team says they’ve successfully demonstrated all seven attacks with proof-of-concept code. Experiments to confirm six other Meltdown-attacks did not succeed, according to a graph published by researchers.
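
For readers who want to picture the mechanism being abused, here is the canonical Spectre variant-1 gadget, essentially the bounds-check-bypass example from the original Spectre paper (the array names and the 4096-byte stride are the illustrative values from that example). This is only the victim-side gadget; the attacker side is the cache-timing measurement:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative only: the canonical Spectre variant-1 (bounds check bypass)
 * pattern.  If the branch predictor has been trained to expect x to be in
 * bounds, the CPU may speculatively execute the body with an attacker-chosen
 * out-of-bounds x.  The speculative load from array2 leaves a cache footprint
 * that depends on the secret byte array1[x], which the attacker can later
 * recover by timing accesses to array2 (e.g. with Flush+Reload). */
uint8_t array1[16];
uint8_t array2[256 * 4096];
size_t  array1_size = 16;

void victim_function(size_t x) {
    if (x < array1_size) {                          /* architecturally safe check  */
        uint8_t secret = array1[x];                 /* speculatively out of bounds */
        volatile uint8_t t = array2[secret * 4096]; /* cache side channel          */
        (void)t;
    }
}
```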

Microprocessor designers have spent the year rethinking the security of their architectures. My guess is that they have a lot more rethinking to do.

Posted on November 14, 2018 at 3:30 PM

Comments

Alan November 14, 2018 4:19 PM

These will be similarly impossible to fix, and the only strategy will be to throw our devices away and buy new ones.

Some might be fixable in software, firmware or microcode. Designers might want to also think about how they can make their architectures patchable or upgradable, although the tradeoff is that would also add another potential malware vector.

Phaete November 14, 2018 4:58 PM

I don’t think they need more ‘rethinking’.
Just go back to the basics; all the vulnerabilities were in the extra garbage.
And for god’s sake, get rid of that x86 bus architecture and get some decent point-to-point serial lines like the Alpha had.
I like my CPUs to be raw workhorses. I’d rather add more than have them use fancy, vulnerable tricks.

echo November 14, 2018 5:00 PM

The first discoveries were a big “nee ner, hah hah”, then began turning into a list. Now the exploits are so many and varied they are beginning to develop their own taxonomy. My eyes have officially glazed over.

I am so glad this isn’t my problem to worry about. On the plus side building solutions gives people something constructive to work towards.

I don’t perceive most business-class computers needing more processing power. The idea of switchable-purpose CPUs has entered the mitigation debate, and I’m guessing this concept could be extended to encompass form factors and add-on boards. My basic idea is that once a baseline has been established you can focus solely on security, in the same way a lot of new physics has switched from discovery to refinement. The so-called “circular economy” is also in the news now, which may make people revisit the idea of build quality and modularity, and for those who need more “power”, such as media professionals and gamers, this can be provided by a quicker and dirtier add-on board or dock or similar.

Most of the threats I perceive boil down to “sanitise your data”. I was taught this in college a lifetime ago and it stuck.

Hubert November 14, 2018 5:10 PM

Some might be fixable in software, firmware or microcode.

Software can disable speculation, caching, and hyperthreading. While it’s way too expensive for general use, expect to see options to allow “sensitive” processes to pay these costs; I wouldn’t be surprised to see GPG do so when operating with a long-term key. Linux patches already exist to automatically do some semi-costly flushing for non-dumpable processes:
https://lwn.net/Articles/764209/
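
For a rough idea of what the per-process opt-in could look like, here is a minimal sketch using the speculation-control and dumpable prctl()s that Linux 4.17 and later expose. It only covers the knobs the kernel actually offers (store-bypass speculation, plus the non-dumpable flag the patches above key off); it is not a general “turn speculation off” switch, and a real tool would want proper error handling and fallbacks:

```c
/* Sketch: a "sensitive" process opting in to costly mitigations.
 * Assumes Linux 4.17+ with speculation-control support; on older
 * kernels or unaffected CPUs the prctl simply fails. */
#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_SET_SPECULATION_CTRL          /* fallbacks for older headers */
#define PR_SET_SPECULATION_CTRL 53
#define PR_SPEC_STORE_BYPASS    0
#define PR_SPEC_DISABLE         (1UL << 2)
#endif

int main(void) {
    /* Ask the kernel to disable speculative store bypass for this task. */
    if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
              PR_SPEC_DISABLE, 0, 0) != 0)
        perror("PR_SET_SPECULATION_CTRL");

    /* Mark the process non-dumpable, which the patch set linked above
     * uses as its trigger for the extra flushing. */
    if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0)
        perror("PR_SET_DUMPABLE");

    /* ... handle long-term key material here ... */
    return 0;
}
```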

Designers might want to also think about how they can make their architectures patchable or upgradable, although the tradeoff is that would also add another potential malware vector.

The vector is already there: https://hackaday.com/2017/12/28/34c3-hacking-into-a-cpus-microcode/

Otto November 14, 2018 5:25 PM

“My guess is that they have a lot more rethinking to do.”

Not a single one of the new architectures designed with Spectre-class vulnerabilities in mind has left the drawing board. It’s been only a year! We won’t see “hardened” architectures for at least another couple of years, likely more. From concept to market normally takes 3 to 5 years. All the currently available solutions are firmware and OS fixes, not architectural fixes.

“all the vulnerabilities were in the extra garbage”

That “extra garbage” is what allows clock speeds in the order of GHz. Also, funny you should mention Alpha: what made the Alpha famous in the 90s (with the 21264) was its massive out-of-order pipeline (yes, with speculative execution).

Rach El November 14, 2018 6:41 PM

Why hasn’t anyone been prosecuted? Not rhetorical, but a straight question. Should people at Intel have been prosecuted?

echo November 14, 2018 7:26 PM

@Rach El

We know from leaks that Intel management knew, then did nothing, and likely directly or indirectly threatened workers who rocked the boat. This implies Intel may have been making fraudulent claims about their products’ fitness for purpose. I cannot comment on the US, but within the context of UK law I am aware of cases where lawyers litigating a civil case went as far as to threaten to sue individual people within an organisation for fraud.

Unfortunately I cannot comment for privacy reasons and because I’m trying to bring a case, but I know in the UK one local government very definitely has ignored complaints and warning signals and compliance auditors’ reports, which has resulted in UK citizens being deprived of their entitlements and in some cases contributed to their unlawful deaths. I am also aware the police look the other way. There is a now decade-old National Audit Office report which ties all this together, which everyone has forgotten about, and which proves in essence they knew and did nothing. Nobody important has ever been challenged. Only a very small handful of low-level admin officers were fired. I have email evidence that city councillors either blackholed problems or were wilfully blind or negligent. In one instance a city councillor was actually promoted to the committee overseeing their statutory obligations after having earlier denied any council responsibility, on top of council officers’ similar denials, when social services and also financial disbursements under wellbeing regulations were the council’s responsibility! Is anyone interested? No! The last email I have is a city councillor haranguing me for attacking the local council, who were doing a wonderful job! I’m sorry, but I thought my local councillor was elected to serve the interests of the people, not a bureaucracy caught red-handed?

Should they be prosecuted? Will they be prosecuted? These are two different questions aren’t they?

Jason November 14, 2018 8:56 PM

Vulnerability or deliberate exploit thanks to the NSA? My guess is the latter. It is in the interest of national security that systems be weakened for attack. Intel and AMD all got “the memo”, just like RSA Security was paid $10 million to use the flawed Dual_EC_DRBG pseudorandom number generator. Our only option is to dump Intel, AMD, and ARM and hope that we can get computers using RISC-V chips from the likes of SiFive.

Phaete November 15, 2018 4:53 AM

@Otto
That “extra garbage” is what allows clock speeds in the order of GHz.
Nope, speculative reading is NOT what allows clock speeds in the order of GHz.
Good silicon and a good production process are what allow GHz speeds in ICs (CPUs are not the only chips that run at GHz).

Speculative reading might let your CPU process the data earlier, making it seem faster, but it will not affect clock speeds.

Wiki about clock speed:
The clock rate of a CPU is normally determined by the frequency of an oscillator crystal. Typically a crystal oscillator produces a fixed sine wave—the frequency reference signal. Electronic circuitry translates that into a square wave at the same frequency for digital electronics applications, etc.

ATN November 15, 2018 4:59 AM

which proves in essence they knew and did nothing.

Everybody knows, and nobody does anything about it: it costs money.

The header of any “free” source file contains text like:
* is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The End User Agreement of any commercial software contains even worse text, usually saying they can execute anything whatsoever on your computer and send anything back over the Internet. Nobody (not even company lawyers) will even read the contract they are entering into before agreeing to pay for/use the provided goods.

People/companies demand that behaviour if it reduces the price they pay in cash.
So there is no need to have a secure processor if the software “does not care”.

Even the justice system seems to ignore the multiple licenses (i.e. the right to modify data even when they tell you they do not modify data) contained in a new Windows PC (BIOS, video card, keyboard, mouse, SD-card reader, …) and the licenses for every USB device driver that has ever been plugged in, by considering a PC trustworthy if it has an “antivirus”.

The whole system, as it is now, means that if you produce bug-free software and hardware, you will lose your job very quickly – and will never be paid a penny more for the increased quality of such software/hardware.

Rj Brown November 15, 2018 7:00 AM

I am more concerned about embedded safety critical applications than desktop applications. The stuff I work on can affect the safety of property, lives, and even national security. My challenge in the face of this is to implement a secure system using insecure components. I am reminded of Seymour Cray’s famous discussion of how he had to build faster computers with unreliable or defective parts. When working on avionics software under DO178, one has a document called the coding standard. It describes coding rules. It says what you cannot do, even if it is legal in the language, because some quirk or bug in the particular compiler toolchain you are using doesn’t handle it in a safely testable manner, or the hardware doesn’t handle it properly. The compiler must be qualified as a tool. Your coding standard is designed for that particular compiler executing on the actual hardware you will be shipping with the device. These devices can control airfoil surfaces of the plane, and a failure can be deadly.

Otto November 15, 2018 7:56 AM

@Phaete I’m afraid processors are a bit more complex than just “how fast can you switch the state of a transistor”. If you make a chip running at 3 GHz but it spends 5 out of every 6 cycles waiting for a branch condition to resolve, it’s pointless to run it at more than 500 MHz. Remove the caches (which are another point of contention for side-channel attacks) and you’re down even further. Remember, 1 in every 5-6 instructions is a branch instruction, 1 in every 3-4 instructions is a memory access. If you don’t predict, speculate and execute out of order with 1-2 ns memory accesses, your GHz-class machine will run as fast as a MHz-class chip, at which point it’s just cheaper and more efficient to build MHz-class systems.
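
To put rough numbers on it, here is a back-of-envelope sketch; the stall counts are illustrative assumptions, not measurements, so plug in your own:

```c
/* Back-of-envelope sketch.  Assumed figures: ~100 ns uncached DRAM access
 * (300 cycles at 3 GHz) and a 15-cycle branch-resolution stall when nothing
 * is predicted or executed speculatively. */
#include <stdio.h>

int main(void) {
    const double clock_hz     = 3e9;    /* nominal 3 GHz core               */
    const double base_cpi     = 1.0;    /* ideal cycles per instruction     */
    const double branch_frac  = 0.20;   /* ~1 in 5 instructions is a branch */
    const double branch_stall = 15.0;   /* cycles waiting for resolution    */
    const double mem_frac     = 0.30;   /* ~1 in 3-4 touches memory         */
    const double mem_stall    = 300.0;  /* uncached DRAM, ~100 ns           */

    double cpi  = base_cpi + branch_frac * branch_stall + mem_frac * mem_stall;
    double mips = clock_hz / cpi / 1e6;

    printf("effective CPI  : %.1f\n", cpi);   /* ~94                        */
    printf("effective MIPS : %.0f\n", mips);  /* ~32, i.e. MHz-class        */
    return 0;
}
```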

There’s just no way around it. Keeping a processor busy at those frequencies requires a lot of risky tricks. That doesn’t mean we can’t build CPUs that are speculative and out-of-order, with caches, and still make them Spectre-free. It’s costly (in transistors, power and performance), but it’s still orders of magnitude better than going back to 80s/early-90s designs. Let’s keep it in this millennium.

Phaete November 15, 2018 10:17 AM

@Otto
Yes, I know all of that, and even more, like RISC vs CISC in these scenarios, but those discussions are beyond the scope here.
It still does not change the fact that speculative reading does not affect clock speed.

And if you want to keep stuff in this millennium, don’t use an x86 architecture.

kendra November 15, 2018 11:11 AM

If you don’t predict, speculate and execute out of order with 1-2 ns memory accesses, your GHz-class machine will run as fast as a MHz-class chip

It’s not exactly true that there’s no way around this. Data fetches need to happen in advance, somehow, but CPU-controlled speculation isn’t the only way. Itanium was different, but also a failure (maybe because of Intel keeping its ISA and its best compiler proprietary). The Mill puts this stuff under software control, but has the notable handicap of not actually existing. I do expect we’ll eventually see another approach have some success. If the compiler were in control we’d still have bugs, but they could be a lot easier to fix.

Otto November 15, 2018 12:12 PM

@Kendra, Itanium did have speculation (and prediction) mechanisms that you had to use in order to avoid massive stalls. True, you can decide not to use them, but you can also flush your speculation structures on each context switch, and you end up no worse than the worst case and much better in the general case (I still believe we can do better than this).

I’m not sure I’m making my point clear, but I’ll try once more: the root of the problem isn’t the complexity of the hardware, it’s the nature of the software. If somebody finds a way to remove branches (and memory accesses) from software, or a way to resolve them faster, then we can remove speculation (and caches) without losing unacceptable levels of performance. We know of applications that fit this (e.g. streaming applications) and we design dedicated hardware for this purpose, but that’s just a tiny portion of all the applications we need to run on our computers. Don’t forget: hardware serves software.

Clive Robinson November 15, 2018 3:54 PM

@ Otto,

If somebody finds a way to remove branches (and memory accesses) from software or a way to resolve them faster, then we can remove speculation (and caches) without losing unacceptable levels of performance.

They already have, in a way. You increase the number and size of registers even more than vector processors do, and you use the very local memory that was used inefficiently for caching as local core RAM for both execution and data. You then strip your software down into “tasklets” that fit within those constraints. The CPU can thus be RISC based and will run with up to a 10 GHz clock. You can get quite a few on the same size of chip Intel are currently using, and if you get your parallel programming right it will do many tasks very efficiently.

The big problem is the so-called “thin streak of sequentialness behind the keyboard”: way too many programmers think in series, and many computer languages are designed to support that way of thinking… Even Intel know it’s a lost cause, hence the number of CPU cores they stick on high-end chips. Throw in a couple, or sixteen, FPGAs as well and the current bottleneck would have been solved, if not for sequential thinking…

Antong November 15, 2018 7:54 PM

Given human nature, the people inside Intel, or whoever they are, will never build the secure architecture we dream of.

Hence it is almost impossible to build one as secure as we expect; moreover, to be honest, we love processor performance and care less about security.

That demand and need for speed is what has led CPU architecture to, in a sense, mutate from hardware to software. And software is basically more vulnerable than hardware, isn’t it?

I guess that’s what @Phaete’s comment was getting at.

Anders November 16, 2018 3:50 AM

Any similar vulnerabilities in RISC-V?

How widespread is RISC-V now? Has it entered mainstream/business computing yet?

Luzugaz Fenyev-Baixar November 16, 2018 12:19 PM

Given the many guises under which similar flaws appear, it might be wondered whether there is not a general quantitative-qualitative theory that explains them all as particular instances of a single concept, much like the way Conley Index theory provides a unified account of the behavior of general dynamical systems.

wumpus November 16, 2018 2:12 PM

@Anders

Problems are due to specific chips, not the architecture. Any RISC-V chip with decent performance that wasn’t built from the ground up to avoid Spectre-type issues will have them (and those built from the ground up to avoid such things will probably still have them anyway).

The older, slow ARM chips will make it difficult to find such issues (although since they rely on branch prediction, it is still possible). About the only thing that is free from such issues is something like an Atmel or PIC chip that is below 200MHz or so.

Erdem Memisyazici November 18, 2018 5:47 AM

Transient is a wonderful way to put it. Imagine a doctor who performs 50 heart surgeries in a row and assumes the 51st must also be a heart surgery, so he operates without reading the patient’s chart. Although in most cases this may provide the benefit of higher performance from your CPU, that benefit comes with the risk of being exploited. There doesn’t seem to be a way to gracefully exit that performance boost without always executing at least one possibly malicious instruction. The software industry can lower the risk of exploitation but not eliminate it, so the latest OS versions will be slower but worth the added security.

john doe November 18, 2018 3:20 PM

“What’s surprising is that it took 20 years to discover it.”

On May 8, 1995, a paper called “The Intel 80x86 Processor Architecture: Pitfalls for Secure Systems”, published at the 1995 IEEE Symposium on Security and Privacy, warned against a covert timing channel in the CPU cache and translation lookaside buffer (TLB). This analysis was performed under the auspices of the National Security Agency’s Trusted Products Evaluation Program (TPEP).

Clive Robinson November 18, 2018 4:18 PM

@ john doe,

On May 8, 1995, a paper…

It was not exactly new then either. Time-based side channels had been known for some time before that. If memory serves, Seymour Cray commented on the subject when talking about some of his work (though it was not called “time-based side channels” back then; that terminology kind of started in the 1970s via Gus Simmons).

The surprising thing is really how long it has taken to exploit it.

That is, few security specialists had any kind of knowledge below the ISA level in the computing stack. They just kind of assumed it was all OK down there. Having worked at the microcode level, the RTL level, and at the logic-gate level and below, I was more than aware of the issues. So like other hardware engineers “we knew” but “said little”. Amongst so many issues we tried to protect against what we thought was most likely, which is the reason parity checking on memory was introduced, to try to fix a “class” of issues. But most hardware engineers were only too aware of how badly designed things were; we knew DMA was a major security hole, as was dual-processor “shared memory”. We just assumed the guys on the other side of the ISA gulf knew as well, and thus that the OS and compiler guys would use the MMU etc. judiciously to mitigate…

At any point somebody could have not just worked out but published an exploit; it just did not happen, nobody gave the snowball the nudge. Now they have, it’s heading down the valley, getting bigger and bigger, and is becoming a bit of an avalanche…

As my son is fond of saying “That’s the way it be Bro”…

Men in Black November 21, 2018 6:38 PM

Chinese police, Interpol, Russian and N/S Korean security services, and other law enforcement agencies operating in the jurisdiction where the computer chips are actually made, do not want “unbreakable” end-user security.

They already run rings around NSA with their goals of foreign surveillance.

Americans cannot even keep a small bank account open online against foreign nation-state attack. And they laugh all the way to the Swiss banks after they have pulled all our teeth and pawned off all our personal and private information to foreign consumer credit bureaus.

moops November 22, 2018 2:03 AM

Move more computing onto accelerators and make the superscalar control processors dumber. Better for the power budget. There is no out-of-order execution on an accelerator, no branch predictors, no latency-timing tricks. The devices are simple SIMT drones, or FPGA finite state machines, or systolic arrays. On typical GPGPU devices the hardware drops in nops to keep the code from diverging. Much harder to snoop out a side channel.

Wesley Parish November 23, 2018 3:27 AM

Just a new twist on the old twist; courtesy of ElReg

3 is the magic number (of bits): Flip ’em at once and your ECC protection can be Rowhammer’d
https://www.theregister.co.uk/2018/11/21/rowhammer_ecc_server_protection/

But if three bits could be changed simultaneously, ECC would not catch the modification. This much people have known about, though the key thing here is that it can be shown to allow Rowhammer attacks through.

As easy as One Two Three … happy happy joy joy!

Clive Robinson November 23, 2018 8:18 AM

@ Wesley Parish,

From the El Reg article,

    The findings are significant because while ECC was once considered a reliable method for thwarting Rowhammer-style attacks, it was thought to be theoretically possible to bypass the defense mechanism. Now an attack has been demonstrated.

In other words it was just a question of time… Because “The laws of physics allow”, and “Human ingenuity knows no bounds when the incentives are right for ‘hinky thinking'”…

The point about ECC, parity checking and similar “Error Detecting / correcting” codes is that like most other hashes they are not perfect, though they might be complex.
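
To make the “not perfect” point concrete, here is a toy sketch (mine, not from the article) using a tiny (8,4) extended Hamming SECDED code. Real DRAM uses a (72,64) code, where many but not all triple-bit flips get flagged, but the failure mode is the same kind of thing: the decoder happily “corrects” a three-bit error into a different, valid codeword and hands back wrong data with no error raised, which is presumably the sort of combination the attack has to hunt for. A minimum-distance search stands in for the real syndrome logic.

```c
#include <stdio.h>

/* Encode 4 data bits into an (8,4) extended Hamming codeword.
 * Bit layout (MSB..LSB): p1 p2 d1 p3 d2 d3 d4 p_overall. */
static unsigned encode(unsigned d)
{
    unsigned d1 = (d >> 3) & 1, d2 = (d >> 2) & 1, d3 = (d >> 1) & 1, d4 = d & 1;
    unsigned p1 = d1 ^ d2 ^ d4;
    unsigned p2 = d1 ^ d3 ^ d4;
    unsigned p3 = d2 ^ d3 ^ d4;
    unsigned w  = (p1 << 7) | (p2 << 6) | (d1 << 5) | (p3 << 4)
                | (d2 << 3) | (d3 << 2) | (d4 << 1);
    unsigned par = 0;
    for (int i = 1; i < 8; i++)
        par ^= (w >> i) & 1;
    return w | par;                       /* overall parity in bit 0 */
}

static int dist(unsigned a, unsigned b)   /* Hamming distance of two bytes */
{
    int n = 0;
    for (unsigned x = a ^ b; x; x >>= 1)
        n += x & 1;
    return n;
}

int main(void)
{
    unsigned cw[16];
    for (unsigned d = 0; d < 16; d++)
        cw[d] = encode(d);

    unsigned stored = cw[11];             /* the word actually in memory */
    int miscorrected = 0, detected = 0;

    /* Try every possible simultaneous flip of 3 of the 8 bits. */
    for (int a = 0; a < 8; a++)
        for (int b = a + 1; b < 8; b++)
            for (int c = b + 1; c < 8; c++) {
                unsigned r = stored ^ (1u << a) ^ (1u << b) ^ (1u << c);

                /* Minimum-distance decoding stands in for the syndrome
                 * logic: find the nearest valid codeword to r. */
                int best = 9;
                unsigned nearest = 0;
                for (int i = 0; i < 16; i++)
                    if (dist(r, cw[i]) < best) {
                        best = dist(r, cw[i]);
                        nearest = cw[i];
                    }

                if (best == 1 && nearest != stored)
                    miscorrected++;       /* silently returns wrong data */
                else
                    detected++;           /* flagged as uncorrectable    */
            }

    printf("3-bit flips silently mis-corrected: %d\n", miscorrected);
    printf("3-bit flips detected:               %d\n", detected);
    return 0;
}
```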

If you think about it, when you write a word to memory you write directly to the data-bit cells, but with error detection and correction you also, at the same time, write to a logic circuit that “eventually” writes to the error-check bits. Thus there is a write penalty to pay that, with tricks, can only be reduced “on average”.

Likewise when you read a word from “error check” memory you read directly from the data-bit cells, which at the same time go, along with the error-check bits, to a logic circuit that “eventually” produces a verification signal. It’s an even longer “eventually” when you add in correcting as well as checking…

Thus, as with the “time” shortcuts in the DRAM design, the error checking and correcting circuits are designed to make that “eventually” as short as possible to keep memory performance high.

When you do things like that without the right knowledge (which might not come along till later), then you are more likely than not to open up a security flaw (the “Security-v-Efficiency” issue). Which just leaves the question of “Can somebody exploit it?”, which in this case is yes, but currently not as easy as it probably will get.

All of these “hidden” hardware issues that are popping up are not really “hidden”, more “forgotten” or “not much talked about” for over a quarter of a century, as I’ve mentioned before.

Believe it or not, they actually go back over half a century to the design of “carry circuits” in adders, which are about the slowest paths in the design of an ALU. Even today people are writing the occasional paper about the problems with “word-wide carry”, not just in ALUs but in MMUs and even DMA controllers. All of which have interesting little “side channels” that can leak information or be abused in some way if checks are not put in to prevent them.

The problem with such checks is that they also take time… so you end up with a Catch-22 type problem, where to go any faster you “should” add checks, but that slows things down again, takes up more real estate, burns more power, generates more heat, etc., for no “performance” benefit. Thus the “So why bother?” question gets asked, followed fairly shortly thereafter by the old “After all, it’s only a theoretical risk” excuse…

As has been noted even the dumbest of chickens come home to roost…
