Countering "Trusting Trust"
Way back in 1974, Paul Karger and Roger Schell discovered a devastating attack against computer systems. Ken Thompson described it in his classic 1984 Turing Award lecture, “Reflections on Trusting Trust.” Basically, an attacker changes a compiler binary to produce malicious versions of some programs, INCLUDING ITSELF. Once this is done, the attack perpetuates, essentially undetectably. Thompson demonstrated the attack in practice: he subverted the compiler of an experimental victim system, allowing him to log in as root without a password. The victim never noticed the attack, even when they disassembled the binaries—the compiler rigged the disassembler, too.
This attack has long been part of the lore of computer security, and everyone knows that there’s no defense. And that makes this paper by David A. Wheeler so interesting. It’s “Countering Trusting Trust through Diverse Double-Compiling,” and here’s the abstract:
An Air Force evaluation of Multics, and Ken Thompson’s famous Turing award lecture “Reflections on Trusting Trust,” showed that compilers can be subverted to insert malicious Trojan horses into critical software, including themselves. If this attack goes undetected, even complete analysis of a system’s source code will not find the malicious code that is running, and methods for detecting this particular attack are not widely known. This paper describes a practical technique, termed diverse double-compiling (DDC), that detects this attack and some unintended compiler defects as well. Simply recompile the purported source code twice: once with a second (trusted) compiler, and again using the result of the first compilation. If the result is bit-for-bit identical with the untrusted binary, then the source code accurately represents the binary. This technique has been mentioned informally, but its issues and ramifications have not been identified or discussed in a peer-reviewed work, nor has a public demonstration been made. This paper describes the technique, justifies it, describes how to overcome practical challenges, and demonstrates it.
To see how this works, look at the attack. In a simple form, the attacker modifies the compiler binary so that whenever some targeted security code like a password check is compiled, the compiler emits the attacker’s backdoor code in the executable.
Now, this would be easy to get around by just recompiling the compiler. Since that will be done from time to time as bugs are fixed or features are added, a more robust form of the attack adds a step: whenever the compiler is itself compiled, it emits code that inserts the malicious code into various programs, including itself.
As long as the compiler source is updated but never completely rewritten, this attack is essentially undetectable.
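The two triggers can be sketched in a toy, string-level model. Every name and pattern below (`check_password`, the trigger strings) is made up for illustration; a real attack operates on generated machine code, not source text, but the logic is the same:

```python
# Toy model: a "binary" is just a string, and compilation is a string
# transform. All names and trigger patterns here are hypothetical.

def trojaned_compile(source: str) -> str:
    """What the infected compiler binary does instead of honest compilation."""
    binary = f"[machine code for: {source}]"   # the honest translation
    if "check_password" in source:             # trigger 1: the login program
        binary += "[+backdoor: also accept the attacker's password]"
    if source.startswith("def compile("):      # trigger 2: the compiler's own source
        binary += "[+backdoor: re-insert both of these triggers]"
    return binary

# Compiling the login program yields a booby-trapped binary:
print(trojaned_compile("def check_password(user, typed): ..."))
# Compiling the *clean* compiler source reproduces the attack:
print(trojaned_compile("def compile(source): ..."))
```

The second trigger is what makes the attack self-perpetuating: even if every line of the compiler's source code is clean, recompiling it with the infected binary re-emits both triggers.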
Wheeler explains how to defeat this more robust attack. Suppose we have two completely independent compilers: A and T. More specifically, we have source code SA of compiler A, and executable code EA and ET. We want to determine if the binary of compiler A—EA—contains this trusting trust attack.
Here’s Wheeler’s trick:
Step 1: Compile SA with EA, yielding new executable X.
Step 2: Compile SA with ET, yielding new executable Y.
Since X and Y were generated by two different compilers, they should have different binary code but be functionally equivalent. So far, so good. Now:
Step 3: Compile SA with X, yielding new executable V.
Step 4: Compile SA with Y, yielding new executable W.
Since X and Y are functionally equivalent, V and W should be bit-for-bit equivalent.
And that’s how to detect the attack. If EA is infected with the robust form of the attack, then X and Y will be functionally different. And if X and Y are functionally different, then V and W will be bitwise different. So all you have to do is to run a binary compare between V and W; if they’re different, then EA is infected.
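The four steps and the final comparison can be simulated in a toy model. Here a “binary” records just two things: the semantics it implements, and a fingerprint of whichever compiler emitted it, so binaries from different compilers differ bit-for-bit even when functionally equivalent. SA, EA, and ET follow the notation above; everything else is illustrative, a sketch of the logic rather than of any real toolchain:

```python
from dataclasses import dataclass

# Toy model of diverse double-compiling (DDC). SA/EA/ET follow the
# notation in the text; all other names are illustrative.

@dataclass(frozen=True)
class Binary:
    semantics: str   # what the program does when run
    codegen: str     # style of the compiler that emitted this code

SA = "source-of-compiler-A"   # purported source code of compiler A

def compile_with(compiler: Binary, source: str) -> Binary:
    """Run `compiler` on `source`, yielding a new binary."""
    if source == SA:
        # A trojaned compiler recognizes its own source and re-inserts
        # the attack; a clean one faithfully implements the source.
        sem = "A+trojan" if "trojan" in compiler.semantics else "A"
    else:
        sem = f"compiled({source})"
    return Binary(semantics=sem, codegen=compiler.semantics)

def ddc(EA: Binary, ET: Binary) -> bool:
    """True if EA passes the check, i.e., no attack detected."""
    X = compile_with(EA, SA)   # step 1: SA compiled by the suspect EA
    Y = compile_with(ET, SA)   # step 2: SA compiled by the trusted ET
    V = compile_with(X, SA)    # step 3
    W = compile_with(Y, SA)    # step 4
    return V == W              # the bit-for-bit comparison

ET = Binary(semantics="T", codegen="T")              # trusted, diverse compiler
EA_clean = Binary(semantics="A", codegen="A")
EA_evil = Binary(semantics="A+trojan", codegen="A")  # carries the attack

print(ddc(EA_clean, ET))   # True: V and W match bit for bit
print(ddc(EA_evil, ET))    # False: the trojan propagates into V but not W
```

In the clean case, X and Y differ only in their `codegen` field, so both compile SA to the same binary. In the infected case, X re-inserts the trojan while Y does not, and the mismatch between V and W exposes it.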
Now you might read this and think: “What’s the big deal? All I need to test if I have a trusted compiler is…another trusted compiler. Isn’t it turtles all the way down?”
Not really. You do have to trust a compiler, but you don’t have to know beforehand which one you must trust. If you have the source code for compiler T, you can test it against compiler A. Basically, you still have to have at least one executable compiler you trust. But you don’t have to know which one you should start trusting.
And the definition of “trust” is much looser. This countermeasure will only fail if both A and T are infected in exactly the same way. The second compiler can be malicious; it just has to be malicious in some different way: i.e., it can’t have the same triggers and payloads as the first. You can greatly increase the odds that the triggers and payloads differ by increasing diversity: using a compiler from a different era, on a different platform, without a common heritage, transforming the code, and so on.
Also, the only thing compiler T has to do is compile the compiler-under-test. It can be hideously slow, produce code that is hideously slow, or only run on a machine that hasn’t been manufactured in a decade. You could create a compiler specifically for this task. And if you’re really worried about “turtles all the way down,” you can write compiler T yourself for a computer you built yourself from vacuum tubes that you made yourself. Since compiler T only has to occasionally recompile your “real” compiler, you can impose a lot of restrictions that you would never accept in a typical production-use compiler. And you can periodically check compiler T’s integrity using every other compiler out there.
For more detailed information, see Wheeler’s website.
Now, this technique only detects when the binary doesn’t match the source, so someone still needs to examine the compiler source code. But now you only have to examine the source code (a much easier task), not the binary.
It’s interesting: the “trusting trust” attack has actually gotten easier over time, because compilers have gotten increasingly complex, giving attackers more places to hide their attacks. Here’s how you can use a simpler compiler—that you can trust more—to act as a watchdog on the more sophisticated and more complex compiler.