Man-in-the-Middle Attacks

Last week’s dramatic rescue of 15 hostages held by the guerrilla organization FARC was the result of months of intricate deception on the part of the Colombian government. At the center was a classic man-in-the-middle attack.

In a man-in-the-middle attack, the attacker inserts himself between two communicating parties. Both believe they’re talking to each other, and the attacker can delete or modify the communications at will.

The Wall Street Journal reported how this gambit played out in Colombia:

“The plan had a chance of working because, for months, in an operation one army officer likened to a ‘broken telephone,’ military intelligence had been able to convince Ms. Betancourt’s captor, Gerardo Aguilar, a guerrilla known as ‘Cesar,’ that he was communicating with his top bosses in the guerrillas’ seven-man secretariat. Army intelligence convinced top guerrilla leaders that they were talking to Cesar. In reality, both were talking to army intelligence.”

This ploy worked because Cesar and his guerrilla bosses didn’t know one another well. They didn’t recognize one another’s voices, and didn’t have a friendship or shared history that could have tipped them off about the ruse. Man-in-the-middle is defeated by context, and the FARC guerrillas didn’t have any.

And that’s why man-in-the-middle, abbreviated MITM in the computer-security community, is such a problem online: Internet communication is often stripped of any context. There’s no way to recognize someone’s face. There’s no way to recognize someone’s voice. When you receive an e-mail purporting to come from a person or organization, you have no idea who actually sent it. When you visit a website, you have no idea if you’re really visiting that website. We all like to pretend that we know who we’re communicating with—and for the most part, of course, there isn’t any attacker inserting himself into our communications—but in reality, we don’t. And there are lots of hacker tools that exploit this unjustified trust, and implement MITM attacks.

Even with context, it’s still possible for MITM to fool both sides—because electronic communications are often intermittent. Imagine that one of the FARC guerrillas became suspicious about who he was talking to. So he asks a question about their shared history as a test: “What did we have for dinner that time last year?” or something like that. On the telephone, the attacker wouldn’t be able to answer quickly, so his ruse would be discovered. But e-mail conversation isn’t synchronous. The attacker could simply pass that question through to the other end of the communications, and when he got the answer back, he would be able to reply.

This is the way MITM attacks work against web-based financial systems. A bank demands authentication from the user: a password, a one-time code from a token or whatever. The attacker sitting in the middle receives the request from the bank and passes it to the user. The user responds to the attacker, who passes that response to the bank. Now the bank assumes it is talking to the legitimate user, and the attacker is free to send transactions directly to the bank. This kind of attack completely bypasses any two-factor authentication mechanisms, and is becoming a more popular identity-theft tactic.
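The relay described above can be sketched in a few lines of Python. This is a toy model, not any real bank's protocol: the `Bank`, `User`, and `relay_attack` names, the challenge-response token, and the account string are all assumptions made for illustration. The point it shows is that the attacker never needs the token's secret; he only forwards messages.

```python
import hashlib
import hmac
import secrets


def one_time_code(secret: bytes, challenge: str) -> str:
    """Toy one-time code: a truncated HMAC of the bank's challenge."""
    return hmac.new(secret, challenge.encode(), hashlib.sha256).hexdigest()[:6]


class Bank:
    """Hypothetical bank that authenticates a session, not each transaction."""

    def __init__(self, token_secret: bytes):
        self._token_secret = token_secret
        self._challenge = ""
        self.authenticated = False
        self.transfers = []

    def request_login(self) -> str:
        self._challenge = secrets.token_hex(8)
        return self._challenge

    def submit_code(self, code: str) -> bool:
        expected = one_time_code(self._token_secret, self._challenge)
        self.authenticated = hmac.compare_digest(code, expected)
        return self.authenticated

    def transfer(self, amount: int, to_account: str) -> bool:
        # Once the session is authenticated, any transfer is accepted.
        if not self.authenticated:
            return False
        self.transfers.append((amount, to_account))
        return True


class User:
    """Legitimate user holding the hardware token's secret."""

    def __init__(self, token_secret: bytes):
        self._token_secret = token_secret

    def answer(self, challenge: str) -> str:
        return one_time_code(self._token_secret, challenge)


def relay_attack(bank: Bank, user: User) -> bool:
    """The attacker sits in the middle and simply forwards both directions."""
    challenge = bank.request_login()   # bank's request arrives at the attacker
    code = user.answer(challenge)      # attacker passes it on; user responds
    bank.submit_code(code)             # attacker forwards the response
    return bank.transfer(5000, "attacker-account")  # then sends his own transaction
```

In this sketch the second factor works exactly as designed, and the attack still succeeds, because the code authenticates the login and not the transfer that follows it.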

There are cryptographic solutions to MITM attacks, and there are secure web protocols that implement them. Many of them require shared secrets, though, making them useful only in situations where people already know and trust one another.

The NSA-designed STU-III and STE secure telephones solve the MITM problem by embedding the identity of each phone together with its key. (The NSA creates all keys and is trusted by everyone, so this works.) When two phones talk to each other securely, they exchange keys and display the other phone’s identity on a screen. Because the phone is in a secure location, the user now knows who he is talking to, and if the phone displays another organization—as it would if there were a MITM attack in progress—he should hang up.

Zfone, a secure VoIP system, protects against MITM attacks with a short authentication string. After two Zfone terminals exchange keys, both computers display a four-character string. The users are supposed to manually verify that both strings are the same—“my screen says 5C19; what does yours say?”—to ensure that the phones are communicating directly with each other and not with an MITM. The AT&T TSD-3600 worked similarly.
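A minimal sketch of why comparing short strings exposes the attacker, assuming a textbook Diffie-Hellman exchange and a string derived by hashing the shared secret. The modulus, generator, and function names here are illustrative assumptions and are not Zfone's actual parameters or wire format; a real design also has to stop the attacker from searching for keys that produce matching strings, which this toy does not attempt.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy modulus (a Mersenne prime); far too small for real use
G = 3           # toy generator, chosen only for illustration


def keypair():
    """Return a (private, public) Diffie-Hellman pair."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def shared_secret(my_private: int, their_public: int) -> int:
    return pow(their_public, my_private, P)


def short_auth_string(secret: int, length: int = 4) -> str:
    """Hash the shared secret and keep the first few hex characters."""
    digest = hashlib.sha256(secret.to_bytes(16, "big")).hexdigest()
    return digest[:length].upper()


def direct_call(length: int = 4):
    """Both ends exchange public keys directly; the strings match."""
    a_priv, a_pub = keypair()
    b_priv, b_pub = keypair()
    return (short_auth_string(shared_secret(a_priv, b_pub), length),
            short_auth_string(shared_secret(b_priv, a_pub), length))


def intercepted_call(length: int = 4):
    """The attacker substitutes his own public key in both directions."""
    a_priv, _a_pub = keypair()
    b_priv, _b_pub = keypair()
    _m_priv, m_pub = keypair()
    # Each side now shares a different secret with the attacker,
    # so the strings on the two screens almost certainly differ.
    return (short_auth_string(shared_secret(a_priv, m_pub), length),
            short_auth_string(shared_secret(b_priv, m_pub), length))
```

With four hex characters the two strings in an intercepted call still agree by chance about once in 65,536 attempts, which is the trade-off for a string short enough to read aloud.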

This sort of protection is embedded in SSL, although no one uses it. As it is normally used, SSL provides an encrypted communications link to whoever is at the other end: bank and phishing site alike. And the better phishing sites create valid SSL connections, so as to more effectively fool users. But if the user wanted to, he could manually check the SSL certificate to see if it was issued to “National Bank of Trustworthiness” or “Two Guys With a Computer in Nigeria.”

No one does, though, because you have to both remember and be willing to do the work. (The browsers could make this easier if they wanted to, but they don’t seem to want to.) In the real world, you can easily tell a branch of your bank from a money changer on a street corner. But on the internet, a phishing site can be easily made to look like your bank’s legitimate website. Any method of telling the two apart takes work. And that’s the first step to fooling you with a MITM attack.
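For readers who want to see what that manual check amounts to, here is a hedged sketch using Python's standard `ssl` module. `fetch_certificate` and `subject_field` are names invented for this example; the dictionary layout is the one `SSLSocket.getpeercert()` returns. Many inexpensive certificates carry no organization name at all, in which case the lookup returns `None`.

```python
import socket
import ssl
from typing import Optional


def fetch_certificate(host: str, port: int = 443) -> dict:
    """Connect over TLS and return the server certificate as a dict."""
    context = ssl.create_default_context()  # verifies the chain and hostname
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def subject_field(cert: dict, field: str) -> Optional[str]:
    """Pull one field, such as 'organizationName', out of the subject."""
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == field:
                return value
    return None


def organization_name(cert: dict) -> Optional[str]:
    """Who the certificate says it was issued to, if it says at all."""
    return subject_field(cert, "organizationName")
```

Calling `organization_name(fetch_certificate("www.example.com"))` prints whatever name the certificate authority recorded; deciding whether that name is the one you expected is still the user's job.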

Man-in-the-middle isn’t new, and it doesn’t have to be technological. But the internet makes the attacks easier and more powerful, and that’s not going to change anytime soon.

This essay originally appeared on

Posted on July 15, 2008 at 6:47 AM • 45 Comments


Nicholas Sherlock July 15, 2008 7:14 AM

Web browsers are moving to make the identity of the server participating in a secure connection more obvious. Firefox 3’s new address bar shows the company’s name in green in a special area to the left of the bar if the certificate includes a name.

noah July 15, 2008 7:36 AM

Nicholas is correct. Firefox 3 shows the full name of my bank whenever I go to their website. Of course, that just adds the extra step of getting a name on the certificate that is close enough that I won’t notice. Would anyone notice that they’re actually at Wachovea Bank and not Wachovia?

Clive Robinson July 15, 2008 7:39 AM

@ Bruce,

You do not mention that there is another issue with SSL and many other systems used for performing transactions.

That is the communications channel is authenticated but not the transactions.

To make online transactions secure two things have to happen,

1, The transaction must be authenticated in both directions.
2, The method of authentication needs to be via a side channel.

Both of these are required to prevent either end of the communications channel being co-opted into being the man in the middle via malware etc.

The best way to do this is with a hand held token and time dependent secure checksums etc.

Allen July 15, 2008 7:44 AM

Here’s another protocol similar to Zfone’s for bootstrapping a secure channel. This protocol is more general purpose. It can be used to bootstrap non-voice (pure data) channels; it can be used offline (i.e. it does not require that the authentication be performed while communicating over the channel); it can authenticate a previously established session; and it can use other methods for authentication than just personal recognition of the other party’s voice, for example, the parties could meet in person, or one could be required to visit the other to produce a photo ID, have fingerprints taken, etc. It does all this while being resistant to eavesdropping, i.e., it does not pass a secret that if intercepted would compromise the secure channel.

Nicholas Weaver July 15, 2008 7:47 AM

Also, in the absence of crypto, on the current network, any passive adversary can be an active adversary: If you can eavesdrop, you can become a MitM

Blake Kaplan July 15, 2008 8:03 AM


When Firefox 3 shows you the company name on a green background in the URL bar, it means the web site you’re at has an “Extended Validation” certificate. These certificates are harder to get (there is more scrutiny required to get one), so ideally, only real companies should be able to get them. is a good description of the UI that Firefox 3 introduced to try to disambiguate “encrypted connection” and “talking to eBay or PayPal.”

Victor Bogado July 15, 2008 8:08 AM


Side channels don’t work, because the attacker can be doing his thing online at the same time.

  • User attempts to access bank, but gets to communicate with a bogus server instead.
  • the bogus server, posing as the user, starts a connection to the real bank.
  • the user enters his credentials, as usual, and the bogus server simply forwards those to the real bank.
  • the bank asks the user for his credentials via the side channel, which the attacker has no control over, but the user is expecting this so he will simply comply.
  • the attacker can now make any bogus operation he wants.

Sure the bank can ask for confirmation for each operation via the secondary channel, but I would imagine that this would quickly become very annoying to the user.

Of course we can think of other ways to implement this automatically; using a modem connection direct to the bank alongside the internet communication, for instance, could help to make this two-way communication seamless. But on the other hand it would make the setup much more complicated, and even this would still be vulnerable to compromised PCs.

Richard July 15, 2008 8:10 AM

I don’t believe the official line for a minute. I’m sure a ransom was paid and both sides agreed to keep it quiet. The official story is ludicrous and similar to ‘we tapped the guard on the shoulder and while he looked the other way we marched the hostages out’

Nick Lancaster July 15, 2008 8:13 AM

Wasn’t MITM the basis of Comcast’s interfering with users doing torrents? (Comcast told the sending and receiving computers, “Okay, all done.”)

The court ruled against Comcast.

Ulrich Boche July 15, 2008 8:37 AM

There is an additional problem inherent in SSL that makes validation more difficult and insecure: all trusted CAs are treated equally in SSL. If one CA that is trusted by the browser manufacturers is rotten or has its private key stolen, the security of certificate validation in total is gone. There is no provision for things like “the certificate from Deutsche Bank can only be signed by VeriSign but not by any other CA”. You could of course display the certificate and check the signer, but how would you know which CA should be the right one?

Ulrich Boche

D0R July 15, 2008 8:42 AM

Not to mention the fact that site admins seem not to care for domain mismatch. Often you want to visit a site that uses https and your browser pops up a warning telling you that the digital certificate was released to while you are trying to connect to Or tells you that the digital certificate has expired. Is this a man-in-the-middle attack, or an error of a sloppy administrator? What are you going to do?

Paeniteo July 15, 2008 8:59 AM

“There is no provision for things like “the certificate from Deutsche Bank can only be signed by VeriSign but not by any other CA””

If an attacker presents you a fraudulent certificate, he could alter that information as well. It would have to be a trustworthy source outside the certificate.
Does not sound easy to get that right.

don't-need-a gun dept. July 15, 2008 9:07 AM

If you don’t want it known, don’t use the telephone.

“Most people think that the standard weapon used by most private investigators is a gun. The fact is, the most powerful weapon a private investigator uses is the telephone.”

Information is more powerful than lethal force. The pen is mightier than the sword. A telephone gives you more options than a gun.

tcliu July 15, 2008 9:13 AM

I don’t get the Zfone authentication. If a MITM attack is in progress, couldn’t the other side fake the response?

“My code is 1234”
MITM: “Mine, too”.
“Ok, great, we’re safe.”

Or do you read alternating digits?

“My code starts with 1”
MITM: “Mine, too, and then, uhhh… a 3?”.

bob July 15, 2008 9:20 AM

I do check certificates when doing anything “important” over the web. But it is still difficult. If the root certificate was issued by “Thawte Security Systems” or “VeriSign Authorization Organization” how do I know those are the same as the “real” company with similar sounding names?

Also one could easily do an MITM attack on a bank customer simply by buying a certificate in the name of a company that sounds similar. If someone has an account at “Security National Bank” a phisher could acquire a cert from a known supplier trusted by the standard browsers in the name of “Security Financial Systems” and it would be close enough to accept.

My mom has recently started paying bills electronically (will wonders never cease?) and she has several bills which are serviced through third-party billing/receiving companies that have NOTHING to identify them with the service provider she is paying – yet it’s the correct recipient. She asks if they are legit (smart woman) yet I have no way of telling, other than that one month later the next bill shows the previous one wasn’t paid, by which time it is too late.

Sam Greenfield July 15, 2008 9:25 AM

@Richard: “I’m sure a ransom was paid and both sides agreed to keep it quiet.”

A Swiss radio station apparently reported that a ransom had been paid, citing an anonymous source within Colombia. However, I am unable to find the original report from a Swiss station. Even if we could find the original report, they seem to have based their entire report on an anonymous source.

Why should I believe one account over another?

Allen July 15, 2008 10:03 AM


Zfone is based on the users being able to recognize the other’s voice over the phone AND it assumes a MiTM has no ability to forge the reading of the digits or the confirmation of correctness. There is no need to alternate because the one trusts the other to tell the truth. If you think otherwise, please explain the attack.

Michael Ash July 15, 2008 10:11 AM

@ tcliu

The idea with Zfone is to prove that the person you’re hearing is also the person your computer is sending packets to. It’s protecting against a MITM attack where the attacker forwards the VoIP packets but sits in an intermediate position to break the encryption and snoop on the conversation. It doesn’t, and can’t, protect against a scenario where the attacker is actually generating the audio you hear. Thus Zfone will protect you against eavesdropping when talking to a friend (because you’ll know if you’re talking to a different person instead) but it won’t protect you if you’re talking to a stranger for the first time.

ITbloke July 15, 2008 10:17 AM


The root certificates are installed in your browser – these are not the same thing as the $25 cert. you can buy online. This is why self-signed certs. don’t automatically work on browsers, and why certain cheap certs. don’t work universally.

Regarding the name presented for the ‘financial institution,’ you’re quite correct – the aforementioned dirt-cheap certs. don’t do any legwork and the name that’s entered may be fake. Regrettably, this was meant to originally be a ‘key’ (groan) part of SSL – that a business’ DUNS number or other identifier was used to verify a business, and certs. wouldn’t be handed out willy-nilly. Now…this is allegedly being handled by very expensive Extended Validation (EV) process, which gets extra ‘chrome’ in modern browsers, stating that the company’s identity was actually verified by the cert. authority (issuer).

TooBad July 15, 2008 10:38 AM

@Ulrich Boche
“There is an additional problem inherent in SSL…”

Unfortunately, there is more than one problem here. SSL server authentication could have eliminated problems like phishing.

The biggest problem has been that the whole Certificate Authority (CA) system has been corrupted by greed, plain and simple. [One can say the same about domain registration, but that is a different, but similar, discussion].

For the CA system to really work, there need to be just a few trusted CAs (the same can be said for domain registration), where all issued SSL server certificates are properly vetted, so one can easily determine the level of trust associated with that certificate.

This was the case, at first, there was one CA, the “RSA Secure Server” CA as I recall. One could easily determine the level of trust associated with the issuing of SSL server certs by that CA (and hence, an SSL connection established from it).

Then, the CA domain grew to around 10 CAs. While not good, it was still ok as this small group of CAs (i.e. Verisign, GTE, etc.) were well regarded and maintained their integrity to ensure a high degree of trust associated with SSL server certificate issuing.

At one point, there was serious discussion about a US government agency (i.e. USPS) becoming “the” CA. While a great idea (providing lots of localized points of presence for physical verification of SSL server certificate owners), it never was able to get properly established.

Then, everything fell apart, as dozens and dozens of CAs were allowed to exist, and all trust in SSL server certificate issuing was essentially eliminated as these corrupt CAs got greedy, issuing things like “quicky” unverified SSL certs where all one needed was an unverified “copycat” domain name and a stolen credit card (Phishing’R’Us). There was no way to determine the level of trust from these “bogus” SSL server certificates and properly vetted SSL server certificates.

The result being that while SSL server authentication provides one of the best ways to prevent phishing, the trusted SSL server certificate needed to make it work has been corrupted by the CA industry.

The EV SSL server certificates are an attempt by the CAs to try and gain some credibility back into the CA system they have corrupted by their own greed. Unfortunately, I expect that same greed by the CAs will lead to such variability in the EV SSL server certificate issuing vetting process, that while better, EV SSL server certs will end up being no better than what we have today.

MikeA July 15, 2008 11:19 AM

Three points:
1) I don’t recall any time when Verisign had a particularly good reputation.
2) As others have noted, it is not only possible but likely that “National Bank of Trustworthiness” has outsourced their website to “Two Guys With a Computer in Nigeria.” Just last night I had what looked a lot like a MITM phish experience with ADP. As they were contracted by my employer to handle payroll, “just moving to another financial services company” is not an option, unless I also want to “just move to another employer.”
3) “just moving” is rarely an option anyway, because pretty much all financial services companies care more for animated advertising than site security. There is no place to move to.
(and if there was, they’d be soon bought by one of the bozos with the extra cash provided by the debit cards those animated ads sell)

Eric Johnson July 15, 2008 11:38 AM

For what it is worth, Ross Anderson’s book “Security Engineering” (which I bought based on Bruce’s recommendation) calls this a “middleperson” attack, as the newer term to use – presumably because it is gender neutral.

Markk July 15, 2008 12:00 PM

Isn’t this a classic case of the technology to solve this being there all along and no one implementing it? I mean PK cryptography was made for essentially this.

With a protocol built in to a browser it would be only a little more difficult to use than https links. Say you want to authenticate with your bank and do secured communication, without all the central key authority stuff.

1-Bank publishes its public key on its Web site, in brochures, etc. You, as a user, have to trust that this is really from the bank, but that is ok in the real world – you could go to a bank branch, look at third party lists, etc.

2- You put that key into your browser, by hand – this avoids all the trusting of DNS, and cert authorities. You can keep it as part of the bookmarks for the site, til its changed. This is basically the difference from https – a manual step – you only have to it do once for a given entity you want to talk to.

3 – Your Browser creates a message with a session key (and maybe extras), and encrypts with the public key of the bank, and sends to bank, bank now uses this key. Essentially like https.

This could be elaborated to close deficiencies, but it solves man in the middle with little hassle and no central authorities needed, just trust in publication, and that the private key of the pair is not broken.

This is basic stuff with Public key and easily doable. I’m surprised that it is not a built in like https not as a replacement but as an augment. SSL is great as far as it goes and works fine for most things its used for.
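Steps 1 and 2 of the scheme above amount to pinning a key obtained out of band. A minimal sketch, assuming the bank publishes a SHA-256 fingerprint of its public key and the browser keeps it next to the bookmark; `PinnedKeys`, `fingerprint`, and the site name are hypothetical, and the key bytes are placeholders. Step 3, encrypting a session key to the bank's public key, needs a real cryptography library and is left out.

```python
import hashlib
import hmac


def fingerprint(public_key: bytes) -> str:
    """Hex SHA-256 of the encoded public key, short enough to print in a brochure."""
    return hashlib.sha256(public_key).hexdigest()


class PinnedKeys:
    """Per-site store of fingerprints the user entered by hand."""

    def __init__(self):
        self._pins = {}

    def pin(self, site: str, published_fingerprint: str) -> None:
        # Done once per site, from a source the user already trusts.
        self._pins[site] = published_fingerprint.lower()

    def matches(self, site: str, presented_key: bytes) -> bool:
        """True only if the key the server presents is the one that was pinned."""
        pinned = self._pins.get(site)
        if pinned is None:
            return False  # no pin yet: nothing to trust
        return hmac.compare_digest(pinned, fingerprint(presented_key))
```

A man in the middle who presents his own key fails `matches`, whatever certificate he holds, because the comparison never consults DNS or a certificate authority.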

Jacson Querubin July 15, 2008 12:35 PM


Some bloggers and news outlets in Latin America have reported that FARC was paid for the hostages – about US$20 million.

It could be simpler than an MITM attack.

It’s always about the trade-off.

Brandioch Conner July 15, 2008 1:06 PM

@Victor Bogado
“- the bank asks the user for his credentials via the side channel, which the attacker has no control over, but the user is expecting this so he will simply comply.”

That is only correct if the “verification” is nothing more than authorizing ANY transaction.

If the bank simply called the phone number that you have on your account and asked “do you wish to transfer $X to account Y press 1 for “yes” or press 2 for “no”” you’d be almost completely safe.

“Sure the bank can ask for confirmation for each operation via the secondary channel, but I would imagine that this would quickly become very annoying to the user.”

How many online transactions does the average person do in a month? I’d be happy with that situation.

Particularly since this should only apply to new transactions. If you set up a bill payment on a recurring basis, you should not be called for that other than when you initially set it up OR when you make a change to it.

roboto7 July 15, 2008 1:13 PM

“But if the user wanted to, he could manually check the SSL certificate to see if it was issued to “National Bank of Trustworthiness” or “Two Guys With a Computer in Nigeria.””

And does the user know what to check for?

“The browsers could make this easier if they wanted to, but they don’t seem to want to.”

Every browser I’ve seen these days refuses to display a site (or at least displays a warning in bright red) if you try to connect over SSL to a website where the cert doesn’t match the domain name, or the cert doesn’t have an “approved” chain of trust.

The user doesn’t need to check, the browser already does the check for them.

I do have one beef with browsers though. If you try to enter info on a site without SSL, you only ever get the one “you’re about to submit data over an unsecured connection” warning, and everyone disables that warning the first chance they get. Opera does this, Firefox does this, IE does this. With good reason — that dialog is so intrusive that you develop a click-thru habit.

It would be nice if there was a ‘fail safe’, maybe a little thought balloon that appeared in red for 5 seconds when you start typing in a web form that said “this connection is not secure. do not enter your password or credit card information.” Best yet, it would be nice if this only appeared rarely, like whenever entering text in an obscured text box (password) or when entering in long strings of digits (credit cards).

SSL connections aren’t the biggest problem, the browsers already try hard to protect the user from themselves. It’s unsecured HTTP connections that are weak.

(personal favorite: a friend showed me that his bank has an unsecured logon on their homepage,

Fraud Guy July 15, 2008 1:57 PM


While working at an ecommerce company, there was an encrypted logon box on an unsecured page; as the PCI compliance checker, I was constantly assured by the IT folk that the data was not left on the unsecured page, and as the page went to the secured login setup, it was protected.

I could not force a full secured page on them, and had to live with it despite misgivings

Richard July 15, 2008 2:10 PM

@Sam Greenfield

Your point is well taken. I like to use the test ‘what is more likely’.

What is more likely:

1) A bunch of heavily armed outlaws were tricked into handing over a high value hostage by social engineering

2) A bunch of heavily armed outlaws were paid a ransom to obtain the hostages’ freedom

AnAussie July 15, 2008 5:07 PM


At least one bank here in Australia does this. An SMS is sent to your mobile detailing who and how much the transaction is for and a random code to confirm it. Enter the code on the webpage and the transaction is processed.

For regular payments, you can pre-authenticate the payee the same way, then it doesn’t bug you when you’re paying them.
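A toy model of that per-transaction confirmation, to show why it resists the relay attack where a login-only code does not. Everything here is an assumption for illustration (the class, the message wording, the six-digit code, the payee names); it is not a description of any particular bank's system.

```python
import secrets
from typing import Optional


class Bank:
    """Hypothetical bank that confirms each transfer over a side channel."""

    def __init__(self):
        self._pending = {}   # confirmation code -> (amount, payee)
        self.completed = []

    def request_transfer(self, amount: int, payee: str) -> str:
        """Queue a transfer and return the SMS text sent to the phone on file."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[code] = (amount, payee)
        return f"Transfer {amount} to {payee}? Confirm with code {code}"

    def confirm(self, code: str) -> bool:
        """Complete the one transfer this code was issued for."""
        transaction = self._pending.pop(code, None)
        if transaction is None:
            return False
        self.completed.append(transaction)
        return True


def user_checks_sms(sms: str, amount: int, payee: str) -> Optional[str]:
    """The user releases the code only if the SMS names the transfer he intended."""
    if sms.startswith(f"Transfer {amount} to {payee}?"):
        return sms.rsplit(" ", 1)[-1]
    return None
```

If a man in the middle swaps in his own payee, the SMS names the wrong account and the user has a chance to stop. The protection rests on the user actually reading the message, and on the phone not being compromised as well.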

Clive Robinson July 15, 2008 6:08 PM

@ Victor Bogado,

“- the attacker can now make any bogus operation he wants.”

No they can’t. The most important part is that the transaction has to be authenticated in both directions in a secure way (which needs something a lot more complicated than something a user can do, hence one reason for a token, albeit a software agent, mobile phone, or specialised hardware dongle with a keypad or other method of getting data in).

The side channel is only required to extend the communications path beyond that of the PC to prevent the MITM attack taking place via a “malware shim” at the I/O driver level on the PC.

So a software agent running on either the PC or a mobile phone would be vulnerable.

Likewise an SMS message has problems as well.

The correct solution is to use a hardware token that is effectively “immutable” and therefore not subject to tampering by malware.

Further, as a hardware token, there should not really be a viable vector/channel with sufficient bandwidth to make a malware attack on it possible, let alone feasible.

I first thought about this problem back in 2000 and outlined it at a seminar. Since then I have posted to this and a couple of other security blogs on a number of occasions, including a fairly good outline of how the system might work.

(And yes, it would appear that at least one company agrees with me, as they have started marketing a system that differs only in that they use a camera to get a secure authentication into the token.)

The only two surprises to me have been,

1, That it has taken 7 years for attacks of this sort (MITM and Malware shim) to be seen.

2, That there has been no serious investigation of how to prevent the attacks in that time.

I also made a prediction this year that if mobile phones had financial authentication apps developed for them by banks (or their agents), it would take less than four years for malware to be developed to attack them (only time will tell if I am right or not 😉).
I will dig out the URLs to my earlier posts if people are interested.

Dampener July 15, 2008 6:33 PM

I think these should be called Human in the Middle attacks (HuMid). Then protocols like SSL could be called de-HuMid-ifiers.

David Keech July 15, 2008 7:03 PM

I’m going to have to vote for Dampener’s suggestion. My website definitely needs a de-HuMid-ifier.

As far as the old term “Man in the middle” is concerned, it is already gender neutral.

The word “man” is derived from the same Latin as the word “manual” which means “by hand”. So a “chairman” is “the hand that controls the meeting” and “manual labour” is labour done by hand, not done by a man.

A man in the middle, therefore, is the “hand” controlling the flow of information and is not necessarily male.

Dylan July 15, 2008 7:19 PM

Does anyone have a link (that is not subscription-only) to more info on the hostage rescue? I can only find generic news items, and no details of the attack vector, background, etc.

It was funny to see that one news site said that US involvement amounted to giving the Colombian military personnel “acting lessons”. So appropriate, for a country that is all about IP and “acting tough.”

G July 15, 2008 8:11 PM

“There’s no way to recognize someone’s voice.”

Bullshit, there are well known programs to derive a signature from an individual’s voice. Common knowledge.

Radovan Semancik July 16, 2008 4:29 AM

And even if somebody really bothers to check the SSL certificate, there are at least few CAs that are OK issuing the certificate of “National Bank of Trustworthiness” to the Two Guys With a Computer in Nigeria, based on just few barely readable (fake) fax messages.

Therefore a system that is considered cryptographically secure in cyberspace is spoiled by insufficient security of realspace processes. As usual.

Nyhm July 16, 2008 8:40 AM

@TooBad – I tend to agree with some of your points. However, a totalitarian structure is not the best answer. We need a distributed trust system; it’s just that a CA hierarchy isn’t the solution (but it’s the best we’ve got for now).

Nonetheless, under today’s CA structure, I totally agree that too many CAs spoil the trust. My tipping point was when I noticed that is verified by, Inc. Yes, they’re their own CA now – get your wholesale certificates.

Actually, it’s perfectly fine (mathematically speaking) for anyone to create their own certificates. What irked me was that Firefox 3 already accepts the GoDaddy authority. Who are the FF3 developers to tell me who to trust?

Fraud Guy July 16, 2008 9:16 AM


Yes, I would like to see your prior predictions…it’s like pulling up the old Calculated Risk files and finding them calling the mortgage crisis several years ago.

Thank you,

Fraud Guy

Zooko O'Whielacronx July 16, 2008 11:21 AM

Dear Bruce Schneier:

Thank you for the mention of Zfone in your recent CRYPTO-GRAM:

On Jul 15, 2008, at 1:21 AM, Bruce Schneier wrote:

 Man-in-the-Middle Attacks

Zfone, a secure VoIP system, protects against MITM attacks with a
short authentication string. After two Zfone terminals exchange
keys, both computers display a four-character string. The users are
supposed to manually verify that both strings are the same — “my
screen says 5C19; what does yours say?” — to ensure that the phones
are communicating directly with each other and not with an MITM. The
AT&T TSD-3600 worked similarly.

This is correct, but it omits two other pieces of the design which
together make Zfone stronger and more convenient than traditional
encrypting phones.

You can see it on the image on this page:

This user interface combines traditional voice check authentication,
the key-continuity defense against MITM attack, and “sticky-note

This combination is intended to make it risky and difficult to launch
a MITM attack on Zfone users even given that most users don’t spend a
lot of effort on security.

I helped Phil design the Zfone protocol in 2006, and I occasionally
consult with him nowadays, as he is getting it standardized and
deployed. (I’ve also, with a certain well-known cryptographer,
invented a newfangled crypto trick to defeat MITM, but that idea
hasn’t gone anywhere yet. If you’re interested I’d be delighted to
explain it — the rest of this letter is not about my new crypto
trick, but instead about Zfone.)

To see how we have tried to make the MITM’s job as uncomfortable as
possible, consider things from the attacker’s perspective:

Alice and Bob are dialing one another using their new Zfones. You, as
a potential Man-In-The-Middle, have to decide right now whether to do
a MITM attack on the Diffie-Hellman key agreement or not. If you
don’t, then you will be unable to listen in on this phone call, but
worse you will also be unable to listen in on future phone calls
without triggering both of their Zfone UIs to change dramatically —
removing their friend’s name from the sticky-note, resetting the
“secure-since” date from the date of their first call to today’s date,
and unsetting the “Compare with partner” field.

On the other hand, if you do launch the MITM attack now, then if Alice
and Bob later make a call which you do not intercept (for example
because one of them travels and connects from a different network), or
if Alice and Bob, on this or a subsequent call, do a
voice-authentication check then you run the risk of being discovered.
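
The key-continuity side of this dilemma can be sketched in a few lines of Python. This is a simplified model, not Zfone's actual code or wire format: the `PeerRecord` fields, the `continuity_proof` and `on_call` helpers, and the use of HMAC-SHA256 over a per-call nonce are all assumptions made for illustration. What it tries to show is that an attacker who skipped the first call does not hold the retained secret, so a later interception resets everything the user sees.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PeerRecord:
    """What one phone remembers about one peer (hypothetical field names)."""
    retained_secret: bytes   # secret carried over from earlier calls
    name: Optional[str]      # the "sticky-note" label typed by the user
    secure_since: date       # date of the first call in this unbroken chain
    sas_verified: bool       # whether the users ever compared check words


def continuity_proof(retained_secret: bytes, call_nonce: bytes) -> bytes:
    """Prove knowledge of the retained secret without revealing it (assumed scheme)."""
    return hmac.new(retained_secret, call_nonce, hashlib.sha256).digest()


def on_call(record: PeerRecord, peer_proof: bytes, call_nonce: bytes,
            new_secret: bytes, today: date) -> PeerRecord:
    """Return the record, and therefore the UI state, after a new call."""
    expected = continuity_proof(record.retained_secret, call_nonce)
    if hmac.compare_digest(expected, peer_proof):
        # Continuity holds: mix the new secret in, leave the UI untouched.
        rolled = hashlib.sha256(record.retained_secret + new_secret).digest()
        return replace(record, retained_secret=rolled)
    # Continuity broken: the name disappears, the date resets to today,
    # and the "compared with partner" state is cleared.
    return PeerRecord(retained_secret=new_secret, name=None,
                      secure_since=today, sas_verified=False)
```

In this model the attacker's only ways to keep the UI unchanged are to have been in the middle from the very first call or to stay out entirely, which matches the choice described above.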

I’m aware of research showing that web browser users are mostly
oblivious to security signals on the edges of the web browser, but the
Zfone UI is different from those and it may turn out to be effective
in the hands of normal users:

  1. If a MITM attacks, then the part of the UI that you actually look
    at and use — the name of who you are talking to — disappears and
    leaves an empty space.
  2. The state of the UI is unchanged from call to call, so you do not
    need to examine it nor do anything to it when you are making a call to
    someone with whom you’ve previously talked. It is only in the case of
    a MITM attack (or in the case that your friend accidentally deleted
    their Zfone directory and had to reinstall) that the UI changes.
  3. The MITM has to commit to whether or not to launch the attack at
    an earlier time, while in ignorance of what the users will do later,
    including on subsequent calls.

To run a full MITM attack with reasonable stealth would be difficult
— in addition to the normal requirements of intercepting and
modifying all packets, it would also probably require voice
recognition in the loop to detect if the users eventually do the
voice-check authentication and abort the attack in a way that looks
like normal software bugginess. Even if you did implement this
attack, you might be forced to abort right when the users are doing
the voice-check authentication, which would tend to arouse suspicion.

If you instead launch a cheap MITM attack without voice recognition in
the loop and without comprehensive interception of all of their
packets (even on subsequent calls from different locations), then you
run the risk of being exposed by the users performing a voice
authentication check and finding that the check words differ.
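
Why the check words differ under attack can be illustrated with a toy Diffie-Hellman exchange. The sketch below is an illustration under stated assumptions, not Zfone's real derivation: the group parameters `P` and `G`, the helper names, and the choice of four hexadecimal characters cut from a SHA-256 digest are all invented here. In the honest case both phones compute the same secret and so display the same string; with an attacker in the middle, each phone shares a different secret with the attacker, so the strings almost certainly disagree.

```python
import hashlib
import secrets

# Toy parameters for illustration only (assumption): a real system would
# use a standardized, much larger group.
P = 2**127 - 1   # a Mersenne prime
G = 3


def keypair():
    """Return (private, public) for one Diffie-Hellman participant."""
    private = secrets.randbelow(P - 3) + 2   # uniform in [2, P - 2]
    return private, pow(G, private, P)


def shared_secret(my_private, their_public):
    """Combine my private value with the public value I received."""
    return pow(their_public, my_private, P)


def short_auth_string(secret, length=4):
    """Derive a short string for the users to read aloud (hypothetical format)."""
    digest = hashlib.sha256(secret.to_bytes(16, "big")).hexdigest()
    return digest[:length].upper()


alice_priv, alice_pub = keypair()
bob_priv, bob_pub = keypair()
mallory_priv, mallory_pub = keypair()

# Honest call: each side receives the other's real public value.
honest_alice = short_auth_string(shared_secret(alice_priv, bob_pub))
honest_bob = short_auth_string(shared_secret(bob_priv, alice_pub))

# Intercepted call: Mallory substitutes her own public value in both
# directions, so Alice and Bob each end up sharing a secret with her.
attacked_alice = short_auth_string(shared_secret(alice_priv, mallory_pub))
attacked_bob = short_auth_string(shared_secret(bob_priv, mallory_pub))
```

Here `honest_alice` always equals `honest_bob`. In this sketch a four-character hex string carries only 16 bits, so `attacked_alice` and `attacked_bob` would still match by luck about once in 65,536 attempts.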

There are many potential attackers who would be deterred by this risk.
There is no safe state that the attacker can reach where they can
continue to eavesdrop without fear of detection. As long as they are
running the attack they are incurring an ongoing risk of being
detected or at least of arousing the suspicions of their victims.

This risk of subsequent detection is incurred even if most users are
careless most of the time — it only takes one voice-check
authentication to retroactively expose the earlier MITM attack on
earlier calls. They might do this for example because they’ve decided
to talk about something sensitive. (I remember when I switched to
using hushmail to exchange letters with the girl who is now my wife. It
was when our correspondence started to become a little more intimate.)
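
The cumulative nature of this risk can be put in rough numbers. As a back-of-the-envelope model (an assumption added here, not a figure from the letter), suppose each intercepted call independently includes a voice check with probability `p_check`. The attacker then stays undetected across `calls` calls with probability `(1 - p_check) ** calls`. The function name and the independence assumption are hypothetical.

```python
def undetected_probability(p_check: float, calls: int) -> float:
    """Chance that none of `calls` intercepted calls includes a voice check.

    Assumes each call is checked independently with the same probability,
    which is a simplification: real users check in bursts, for example
    when the conversation turns sensitive.
    """
    if not 0.0 <= p_check <= 1.0:
        raise ValueError("p_check must be between 0 and 1")
    if calls < 0:
        raise ValueError("calls must be non-negative")
    return (1.0 - p_check) ** calls


def detection_probability(p_check: float, calls: int) -> float:
    """Chance that at least one voice check exposes the attack."""
    return 1.0 - undetected_probability(p_check, calls)
```

Under these assumptions, users who check on only 2% of calls still leave the attacker with roughly a 36% chance of surviving 50 calls, so even careless users impose a substantial long-run risk.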

Caveat: actually there is a safe state that the attacker can reach,
if they can reliably, in near-realtime, fake the victims’ voices doing
the voice-authentication check. This is what Phil calls “The Rich
Little Attack”. I believe that such an attack is currently
expensive (i.e. the most cost-effective way to do it is probably to
hire a talented human or several). The crypto trick that I invented
has the potential to defeat even this attack, but as I mentioned I
couldn’t figure out how to make it practical.

Okay, this letter was way longer than I intended, but I hope it shows
why Zfone has the potential to be substantially stronger, even in the
hands of users with imperfect security practices, than previous (and
current) alternatives.

By the way, good job on organizing the workshop on security and human
behavior. Like behavioural economics and “experimental philosophy”
(as someone recently termed it), human-behavior-oriented security is a
much needed perspective.

I liked your rechristening of “polite devices” into “selective device
jamming”. I would point out that politeness among humans is
voluntary. (When it becomes involuntary it is no longer a matter of

Most people, including me, would be happy to configure their devices
to voluntarily do things like silence themselves in theatres. The
capability security folks have a paradigm for the implementation of
such behavior, called “Voluntary Oblivious Compliance”.


Zooko O’Whielacronx

P.S. Feel free to publish or re-use this letter as you see fit, with
one caveat: a certain well-known cryptographer is, as you probably
know, jealous of his privacy, and it might be polite to remove his
name before publishing this letter.

P.P.S. Dear Phil: writing the above has made me think that Zfone
would be more secure if it didn’t have those “Secure” and “Clear”
buttons on the UI. There would be less cognitive clutter to distract
the users from the important stuff. Maybe the “Secure” and “Clear”
buttons could be moved into the “Advanced Usage” window, if not
removed altogether.

Archimerged July 16, 2008 11:36 AM

@Fraud Guy

To demonstrate the insecurity of accepting a password on an http page and sending it via POST to an https page, modify the http page on a router so that it leaks the password. Then present the password to the people who assure you “it is just as secure.”

Fraud Guy July 16, 2008 4:34 PM


I was not the tech guy, but the fraud prevention guy, and so was not able to test their assertions at the time. I was more of a cat herder for the project. Now if I had a chance to interview them for a time, instead of trying to get a quick answer in a committee meeting that the COO was trying to move along…

However, I did have them sign off that they guaranteed the security of their set up.

Jon Sowden July 16, 2008 8:27 PM

Follow up on the Betancourt rescue

Abuse of trust. Which is still a security hack, but one that makes us all less safe, even though in this case it resulted in the release of Betancourt. The story as reported doesn’t gel to me … how could the people on the mission not notice that one of them was wearing a Red Cross?

Short term gain for long term loss. What a shame.

sooth sayer July 16, 2008 11:47 PM


Considering this “technique” has been used for a positive thing, can’t you recommend the term to be called “man in the middle defense” instead.

And maybe you should use the term “deception” with some care .. is deceiving a thief deception?

Hasn’t the legal system for the past 1000 years admitted the right of self-defense and has not called it murder?

You are a smart man with wrong politics .. maybe you would change your views.

Clive Robinson July 17, 2008 1:34 AM

@ sooth sayer,

“Considering this “technique” has been used for a positive thing, can’t you recommend the term to be called “man in the middle defense” instead.”

I think the Guys over at Cambridge Labs have already taken that name for their use of a hardware device that a consumer fits between their smartcard and an EPOS reader to record the real details of the transaction and not what the EPOS terminal is telling them.

