August 15, 2000

by Bruce Schneier
Founder and CTO
Counterpane Internet Security, Inc.
schneier@schneier.com
<http://www.counterpane.com>

A free monthly newsletter providing summaries, analyses, insights, and commentaries on computer security and cryptography.

Back issues are available at <http://www.counterpane.com>. To subscribe or unsubscribe, see below.

In this issue:

Secrets and Lies: Digital Security in a Networked World
Microsoft Vulnerabilities, Publicity, and Virus-Based Fixes
News
Counterpane Internet Security News
Crypto-Gram Reprints
European “Crime in Cyberspace” Convention
The Doghouse: Authentica
Bluetooth
Comments from Readers

Secrets and Lies: Digital Security in a Networked World

I’ve written a new book.

I started writing this book in 1997; it was originally due to the publisher by April 1998. I eventually delivered it in April 2000, two years late. I have never before missed a publication deadline: books, articles, or essays. I pride myself on timeliness: A piece of writing is finished when it’s due, not when it’s done.

This book was different. I got two-thirds of the way through the book without giving the reader any hope at all. And it was about then I realized that I didn’t have the hope to give. I had reached the limitations of what I thought security technology could do. I had to hide the manuscript away for over a year; it was too depressing to work on.

I came to security from cryptography, and framed the problem with classical cryptography thinking. Most writings about security come from this perspective, and it can be summed up pretty easily: Security threats are to be avoided using preventive countermeasures.

For decades we have used this approach to computer security. We draw boxes around the different players and lines between them. We define different attackers—eavesdroppers, impersonators, thieves—and their capabilities. We use preventive countermeasures like encryption and access control to avoid different threats. If we can avoid the threats, we’ve won. If we can’t, we’ve lost.

Imagine my surprise when I learned that the world doesn’t work this way.

I had my epiphany in April 1999: that security was about risk management, that detection and response were just as important as prevention, and that reducing the “window of exposure” for an enterprise is security’s real purpose. I was finally able to finish the book: offer solutions to the problems I posed, a way out of the darkness, hope for the future of computer security.

“Secrets and Lies” discusses computer security in this context, in words that a business audience will understand. It explains, in my typical style, how different security technologies work and how they fail. It discusses the process of security: what the threats are, who the attackers are, and how to live in their world.

It’ll change the way you think about computer security. I’m very proud of it.

Information about the book:
<http://www.schneier.com/book-sandl.html>

Order the book (at a 20% discount) from Amazon:
<http://www.amazon.com/exec/obidos/ASIN/0471253111/…>

If you use that URL to order the book from Amazon, a portion of the purchase price will go to EPIC.

Microsoft Vulnerabilities, Publicity, and Virus-Based Fixes

The latest tale of security gaps in Microsoft Corp.’s software is a complicated story, and there are a lot of lessons to take away—so let’s take it chronologically.

On June 27th, Georgi Guninski discovered a new vulnerability in Internet Explorer (4.0 or higher) and Microsoft Access (97 or 2000), running on Windows (95, 98, NT 4.0, 2000). An attacker can compromise a user’s system by getting the user to read an HTML e-mail message (not an attachment) or visit a Web site.

This is a serious problem, and has the potential to result in new and virulent malware. But it requires Microsoft Access to be installed on the victim’s computer, which, while common, is by no means universal. A virus that exploits this vulnerability will not spread as widely as, say, Melissa. In any case, Microsoft published a fix on July 14th, and I urge everyone to install it.

On July 17th, SANS promulgated an e-mail warning people of the “most dangerous flaw found in Windows workstations.” I can’t really figure this e-mail out; it seems to be primarily a grab for press coverage. Some of it is suspiciously vague: “We developed this exploit further and realized that this is one of the most serious exploits of Windows workstations in the last several years.” “Developed”? How? No one says. Some of it brags: “Microsoft asked us not to release the details until they had a fix.” “Release the details”? But the original Bugtraq posting was pretty explanatory, and SANS has not released anything new.

Still, the SANS e-mail received a lot more publicity than the Bugtraq announcement or the Microsoft patch, so it’s hard to complain too much.

But the SANS announcement had a much more disturbing section: “It may be possible to fix this vulnerability automatically, via an e-mail without asking every user to take action. The concept is similar to using a slightly modified version of a virus to provide immunity against infection. SANS is offering a $500 prize (and a few minutes of fame) to the first person who sends us a practical automated solution that companies can use, quickly, easily, and (relatively) painlessly to protect all vulnerable systems.” (This paragraph is no longer on the Web site, which claims that “winning entries have been received.”)

This is a really, really dumb idea, and we should put a stop to this kind of thinking immediately. Every once in a while someone comes up with the idea of using viruses for good. Writing a virus that exploits a particular security vulnerability in order to close that vulnerability sounds particularly poetic.

Here’s why it’s such a bad idea. First, there’s no audit trail of the patch. No system administrator wants to say: “Well, I did try to infect our systems with a virus to fix the problem, but I don’t know if it worked in every case.”

Second, there’s no way to test that the virus will work properly on the Internet. Would it clog up mail servers and shut down networks? Would it properly self-destruct when all mail clients were patched? How would it deal with multiple copies of itself?

And third, it would be easy to get wrong and hard to recover from. Experimentation, most of it involuntary, proves that viruses are very hard to debug successfully. Some viruses were written to propagate harmlessly, but did damage because of bugs in their code. Intentional experimentation proves that in your average office environment, the code that successfully patches one machine won’t work on another, sometimes with fatal results. Combining the two is fraught with danger. Every system administrator who’s ever automated software distribution has had the “I just automatically, with the press of a button, destroyed the software on hundreds of machines at once!” experience. And that’s with systems that you can *stop*; self-propagating systems don’t even let you shut them down when you find the problem.

In any case, the SANS announcement was made even more confusing by the announcement of another Microsoft vulnerability at the same time…one that I think is even more serious than the one SANS publicized. (The vulnerability was first discovered on July 2nd, but was independently discovered and published on Bugtraq on July 18th.)

A buffer overflow in Microsoft Outlook or Outlook Express allows an attacker to execute arbitrary code on a victim’s machine just by sending him an e-mail. In Outlook Express, the victim doesn’t even have to open the e-mail, or preview it. All he has to do is download it. In Outlook, he has to read it.

That’s the bad news. The good news is that it only is a vulnerability for users who have POP or IMAP installed; those using Outlook’s default corporate configuration are not vulnerable. (Home users who link to commercial ISPs are much more likely to be vulnerable.) So again, a virus that exploits this vulnerability would be dangerous and unpleasant, but would not spread unchecked.

Microsoft has a fix. Originally (on July 18th) it required you to upgrade your version of Outlook or Outlook Express, but two days later Microsoft did the right thing and issued a patch. (In typical Microsoft fashion, it isn’t a patch for all versions, although they claim that at the link site. If you’re running Outlook Express 4.0, your only option is to install the upgrade; the patch for the 4.0 version is “coming soon.”)

SANS issued another e-mail on July 21st, with more dire warnings: “Please fix this before you go home today. And if you have gone home, go back to the office and fix it.” In my opinion, this warning blew the threat completely out of proportion, and was irresponsible to send. SANS made it sound like a virus attack already in progress, not a new vulnerability that someday might be exploited. And right on the heels of the previous warning, it got lost in the noise. When I received the second SANS e-mail, I thought it was another reminder for the first vulnerability. I’ll bet that many users were similarly confused, and ignored it as well.

There are several lessons here.

1. Computer programs have two sorts of vulnerabilities, nicely illustrated by these two attacks. First, they have vulnerabilities connected to the basic design of the operating system they run on and the way it chooses to interlink programs; the Access attack demonstrates this. Second, they have vulnerabilities based on coding mistakes; the buffer overflow problem is an example.

2. It’s not enough to release a patch. The press often gets this wrong. They think the sequence is: vulnerability publicized, patch released, security restored. In reality, it doesn’t work that way. You don’t regain security until you install the patch. Even though both of these vulnerabilities have been patched, I predict attack tools that use them. Many users just won’t bother installing these patches. For publicizing the two vulnerabilities, SANS is to be commended.

3. Sensationalizing vulnerabilities will backfire. Both of these vulnerabilities are serious, but neither is monumental. Calling something “the most dangerous flaw” leads people to trivialize other flaws. I worry about the public being completely unable to determine what is important. We’ve seen viruses that fizzle, and others that run rampant. We’ve seen vulnerabilities that look serious but don’t amount to anything, and ones that are trivial and exploited again and again. SANS needs to be a voice of reason, not of hyperbole.

4. Writing a virus to exploit a vulnerability is a bad idea, even if the goal of that virus is to close that vulnerability. Viruses, by their very nature, spread in a chaotic and unchecked manner; good system administration is anything but.

5. There are still lots of serious vulnerabilities in Microsoft products, and in the interactions between products, waiting to be discovered.

The Access/IE vulnerability:
<http://www.securityfocus.com/bid/1398>
<http://www.computerworld.com/cwi/story/…>

The SANS announcement:
<http://www.sans.org/newlook/resources/win_flaw.htm>

Microsoft’s “workaround”:
<http://www.microsoft.com/technet/security/bulletin/…>

The Outlook vulnerability:
<http://www.securityfocus.com/bid/1481>

Reports on the vulnerability:
<http://www.securityfocus.com/news/62>
<http://www.computerworld.com/cwi/story/…>

Microsoft’s fix:
<http://www.microsoft.com/windows/ie/download/…> [link moved to http://www.microsoft.com/windows/ie/downloads/…]
<http://www.microsoft.com/technet/security/bulletin/…>

This article originally appeared in:
<http://www.zdnet.com/zdnn/stories/comment/…>

News

Java security: trusted security providers.
<http://metalab.unc.edu/javafaq/reports/JCE_1.2.1.html>

I had nothing to do with this, but thought it was funny:
<http://segfault.org/story.phtml?id=396f3e5c-0958dfa0>

Quantum cryptography and gravity waves:
<http://www.newscientist.com/news/news_224738.html>

Snake-oil alert! Even TISC gets taken once in a while.
<http://tisc.corecom.com/newsletters/213.html>

Building the perfect virus. An interesting and disturbing article:
<http://www.hackernews.com/bufferoverflow/99/nitmar/…>

The U.S. announces new crypto regulations:
<http://www.wired.com/news/politics/0,1283,37617,00.html>
White House statements:
<http://cryptome.org/us-crypto-up.htm>

Kevin Mitnick teaches social engineering:
<http://www.zdnet.com/zdnn/stories/news/…>

Password Safe:
<http://www.zdnet.com/sp/stories/column/…>

If you think cookies are bad, meet Web bugs:
<http://www.ntsecurity.net/Articles/Index.cfm?…>

Meanwhile, Microsoft adds cookie privacy features to Internet Explorer:
<http://www.wired.com/news/business/0,1367,37703,00.html>
They claim to be the first browser to do so, but Netscape has had these features for a while:
<http://www.wired.com/news/business/0,1367,37723,00.html>

A URL scam. A Web site in Russia, paypai.com, was masquerading as paypal.com. The scam was to send PayPal users e-mails asking them to log in, with the fake URL in the e-mail message. The user clicked on the e-mail link and got the fake page, which looked like the real page. Then the user entered his username and password, which the fake site stole.
<http://www.zdnet.com/zdnn/stories/news/…>
<http://www.msnbc.com/news/435937.asp>
Then, days after this story broke, people started getting e-mails asking them to click on <http://www.paypal.x.com> to see if they’d won a sweepstakes. X.com recently purchased Confinity, the creators of PayPal…but who is going to know that? I thought the X.com e-mail was another scam, and I’ll bet others did, too. This is an excellent illustration of the problems of lousy authentication on the Internet.

Gregory Benford on the future of privacy:
<http://www.wired.com/news/technology/…>

I assume you’ve all read about the FBI’s Carnivore Internet wiretapping device. Here are some essays you may have missed:
<http://www.sfgate.com/cgi-bin/article.cgi?file=/…>
<http://www.crypto.com/papers/opentap.html>

Review of personal firewalls:
<http://securityportal.com/cover/coverstory20000717.html>

A good editorial on the problems with the U.S. electronic signature law:
<http://www.nwfusion.com/columnists/2000/0724works.html>

Good article on intrusion-detection systems and the problems of false positives:
<http://www.zdnet.com/eweek/stories/general/…>

Cybercrime and law enforcement: an academic legal paper. Interesting reading.
PDF: <http://www.sinrodlaw.com/CyberCrime.pdf>
MS Word format: <http://www.sinrodlaw.com/cybercrime.doc>

Editorial on why computer security is “different” from the rest of IT:
<http://www.zdnet.com/enterprise/stories/main/…>

“Tangled Web: Tales of Digital Crime from the Shadows of Cyberspace”: an excellent book on cybercrime by Richard Power:
<http://www.amazon.com/exec/obidos/ASIN/078972443X/…>

William Friedman filed a patent application for an Enigma-like encryption device in 1933. The Patent Office awarded the patent this month.
<http://www.patents.ibm.com/details?…>
It looks like a patent for the M-229, or maybe the M-134a. It’s hard to tell.

Draconian cyber-surveillance in the UK:
<http://www.mercurycenter.com/svtech/news/indepth/…>

Good article on the liability of programmers who write malware, from Salon:
<http://salon.com/tech/feature/2000/08/07/…>

Is the Web in for more attacks?
<http://www.zdnet.com/zdnn/stories/news/…>

Security vulnerability in Adobe Acrobat. An attacker could create a PDF file that, when viewed, exploits a buffer overflow and runs arbitrary code on the victim’s machine. Here’s the patch:
<http://www.adobe.com/misc/pdfsecurity.html>
Interesting note on the page: “There have been reports of fraudulent security patches being distributed through e-mail. Adobe will distribute patches only through the Adobe Web site and not by e-mail.” No word, though, on how we are supposed to authenticate the Web site.

The debate continues on script kiddies:
<http://www.nwfusion.com/news/2000/0727holes.html>

Excellent article on non-repudiation:
<http://firstmonday.org/issues/issue5_8/mccullagh/…>

Counterpane Internet Security News

Forbes profiled Counterpane, Bruce Schneier, and his new book:
<http://www.forbes.com/tool/html/00/jul/0731/feat.htm>

The Applied Cryptography Source Code Disk Set can now be exported to any country except these seven embargoed nations: Cuba, Iran, Iraq, Libya, North Korea, Sudan, and Syria. For details, visit:
<http://www.schneier.com/book-applied-source.html>

Crypto-Gram Reprints

A Hardware DES Cracker:
<http://www.schneier.com/…>

Biometrics: Truths and Fictions:
<http://www.schneier.com/…>

Back Orifice 2000:
<http://www.schneier.com/…>

Web-Based Encrypted E-Mail:
<http://www.schneier.com/…>

European “Crime in Cyberspace” Convention

The Council of Europe recently released a draft of a document called the “Draft Convention on Cybercrime.” This document is meant as an international treaty governing “cybercrime,” and attempts to standardize laws to make prosecuting hackers easier (some countries have no laws specifically governing computer attacks).

It’s a well-intentioned effort, but one provision has the potential to seriously harm security research. It’s the provision that makes attack tools illegal.

I’ve talked about this already with respect to similar American laws. A long list of security professionals sent a letter regarding this issue:

“We are concerned that some portions of the proposed treaty may inadvertently result in criminalizing techniques and software commonly used to make computer systems resistant to attack. Signatory states passing legislation to implement the treaty may endanger the security of their computer systems, because computer users in those countries will not be able to adequately protect their computer systems and the education of information protection specialists will be hindered.

“Critical to the protection of computer systems and infrastructure is the ability to

Test software for weaknesses
Verify the presence of defects in computer systems
Exchange vulnerability information

“System administrators, researchers, consultants, and companies all routinely develop, use, and share software designed to exercise known and suspected vulnerabilities. Academic institutions use these tools to educate students and in research to develop improved defenses. Our combined experience suggests that it is impossible to reliably distinguish software used in computer crime from that used for these legitimate purposes. In fact, they are often identical.

“Currently, the draft treaty as written may be misinterpreted regarding the use, distribution, and possession of software that could be used to violate the security of computer systems. We agree that damaging or breaking into computer systems is wrong and we unequivocally support laws against such inappropriate behavior. We affirm that a goal of the treaty and resulting legislation should be to permit the development and application of good security measures. However, legislation that criminalizes security software development, distribution, and use is counter to that goal, as it would adversely impact security practitioners, researchers, and educators.”

The report:
<http://conventions.coe.int/treaty/en/projets/…>

Letter from dozens of security professionals criticizing the report:
<http://www.cerias.purdue.edu/homes/spaf/coe/…>

An essay questioning the report:
<http://securityportal.com/topnews/…>

The Doghouse: Authentica

This is just another company that believes it can secure digital content on another user’s computer. Of course it’s snake oil, and normally I wouldn’t bother even listing them. But they 1) use my name, and 2) profoundly don’t get it.

Question 11 of their FAQ reads: “How secure is information I’ve protected with PageVault? According to Bruce Schneier, an encryption expert, it would take one trillion dollars worth of computers a trillion years to break 128-bit encryption—the kind used in PageVault. And, once they had accomplished that, they would only have a key for a single page of one document. Now they’d need to do it all over again for each successive page of every document.”

What does breaking the encryption have to do with breaking the system? Haven’t these people learned anything from the DeCSS story?

The quote is from:
<http://www.authentica.com/products/faq.html#pagevault>

The Web site:
<http://www.authentica.com>

Why this kind of thing won’t work, ever:
<http://www.schneier.com/…>

Bluetooth

Sometime in the 1950s, various governments realized that you could eavesdrop on data-processing information from over a hundred feet away, through walls, with a radio receiver. In the U.S., this was called TEMPEST, and preventing TEMPEST emissions in radios, encryption gear, computers, etc., was a massive military program. Civilian computers are not TEMPEST shielded, and every once in a while you see a demonstration where someone eavesdrops on a CRT from 50 feet away.

Soon it will get easier.

Bluetooth is a short-range radio communications protocol that lets pieces of computer hardware communicate with each other. It’s an eavesdropper’s dream. Eavesdrop from up to 300 feet away with normal equipment, and probably a lot further if you try. Eavesdrop on the CRT and a lot more. Listen as a computer communicates with a scanner, printer, or wireless LAN. Listen as a keyboard communicates with a computer. (Whose password do you want to capture today?) Is anyone developing a Bluetooth-enabled smart card reader?

What amazes me is the dearth of information about the security of this protocol. I’m sure someone has thought about it, a team designed some security into Bluetooth, and that those designers believe it to be secure. But has anyone reputable examined the protocol? Is the implementation known to be correct? Are there any programming errors? If Bluetooth is secure, it will be the first time ever that a major protocol has been released without any security flaws. I’m not optimistic.

And what about privacy? Bluetooth devices regularly broadcast a unique ID. Can that be used to track someone’s movements?

The stampede towards Bluetooth continues unawares. Expect all sorts of vulnerabilities, patches, workarounds, spin control, and the like. And treat Bluetooth as a broadcast protocol, because that’s what it is.

Bluetooth:
<http://www.bluetooth.com>

A list of Bluetooth articles, none of them about security:
<http://www.zdnet.co.uk/news/specials/1999/04/bluetooth/>

One mention of security:
<http://www.zdnet.co.uk/news/2000/24/ns-16164.html> [link moved to http://news.zdnet.co.uk/story/0,,s2079718,00.html]

An essay about the Bluetooth hype:
<http://www.idg.net/ic_199451_797_9-10000.html>

Recent article on TEMPEST:
<http://www.zdnet.com/zdnn/stories/news/…>

Comments from Readers

From: “Carl Ellison” <cmeacm.org>
Subject: Security of Social Security Numbers

The SSN story <http://news.cnet.com/news/0-1005-200-340248.html> misses the point. The fault isn’t in release of SSNs. That’s still a problem because it facilitates data aggregation, but the identity theft problem is the stupidity of companies and agencies that accept information anyone can acquire as a means to authenticate someone.

The net is making this worse, squared.

1. because of the net, this information is now more widely and easily available, making the attacker’s job easier.

2. to take advantage of the net, companies want to do their authentication on-line.

So, is the answer digital signatures and ID certificates? ID CAs are as subject to these two points as any other company.

This is a serious problem in the way people do business.

The assumptions being challenged here have been on shaky ground for a long time (e.g., since the rise of cities). However, with the speed of change due to the Internet, we can see the effects more clearly.

If we’re going to fix this problem, we need to fix human habits, not technology. It is human habit to believe that names work and that your knowledge of someone’s past implies that you are that person, etc. We haven’t built the human habits to respond to the new truths—that names are not valid identifiers and that everyone in the world will have very good knowledge of the past of anyone they choose to access.

From: “Mat Butler” <wingedvoidnet.com>
Subject: Full Disclosure versus hiding of information

I read with some interest your thoughts on the New York Times versus John Young (the PDF obfuscation scandal), and realized that the ideas presented by the “obfuscate sensitive information” crowd are a lot like the same ones used by the U.S. government and military when referring to classified information—only those who ‘need to know’ have access to it.

The problem comes in, in the public sector, when it’s realized that those who “need to know” have not had to submit to a security clearance investigation. This is more pronounced when the people who legitimately need access to the information to secure their systems are part of the “computer underground,” and trade information for favors from their acquaintances… and information about security vulnerabilities travels fast in such circles.

The argument, broken down in simple form:

1) Security vulnerabilities are told to people who need to know—Web server operators, system administrators, etc.

2) Trust of the people who “need to know” cannot be verified, since there’s no background checks done, and there’s no centralized information store about all sysadmins anyway. (There’s organizations like SAGE, which espouse principles like the SAGE Code of Ethics, but there’s no requirement that anyone live up to those codifications.)

3) So, essentially, you’re giving this information to people you do not trust, and cannot validate that anyone else trusts.

In addition to this, even when the person who “needs to know” -should- be notified, he or she often isn’t—this can be seen in every Windows NT installation that hasn’t gone through the steps to secure its IIS. (And, surprisingly, very few organizations actually have security notification mailing lists for their products—and even more surprisingly, the sysadmins who run or are responsible for those softwares rarely subscribe to them.) So even when the company -is- informed of the vulnerability, they often release a patch that’s never seen by those who need it.

It’s better to fling the information far and wide, and get it into as many discussion circles/professional organizations/sysadmin professional contacts as possible, since it’s the only way to ensure that the largest number of interested parties can at least know what’s out there. (Now, if they choose to not implement the knowledge, I would think that that’s probably their own, or their company’s, responsibility.)

From: Sean Lambert <seanlmetaip.checkpoint.com>
Subject: Cyber Group Network Corp.

From the July 15, 2000 CRYPTO-GRAM:

> The Cyber Group Network Corp claims to have a technology
> that allows you to locate a stolen computer, remotely retrieve
> information from it, and then destroy it. Sounds a bit far fetched.
> But they take “security by obscurity” to new heights: “According to
> Nish Kapoor, a spokesperson for The Cyber Group Network, the patent
> pending technology that makes all this possible is being
> manufactured and developed at a remote, top-secret location
> identified only as ‘Area 74.’” Wow.
> <http://www.newsbytes.com/pubNews/00/151921.html>

Most readers will ponder the implications of this tool to users and thieves. How useful would it be? Would I prefer tracking or meltdown? If I was going to steal it, how would I disable it?

I pondered it from another angle: bored hackers with a wireless handheld and a GPS. Could someone walk by your office building and send the meltdown command to all of your computers? Could some random person track you wherever you go? Is it possible that someone would set up a Web site where your location (within 5 feet!) is displayed on a map?

But don’t worry, this tool is being developed in a location so secret that Nish Kapoor, a spokesperson for The Cyber Group Network, doesn’t even know where it is. That takes all my worries away.

Just be careful who you cut off in traffic if you have one of these in your laptop.

From: Markus Kuhn <Markus.Kuhncl.cam.ac.uk>
Subject: Re: Security Risks of Unicode

> I don’t know if anyone has considered the security implications of this.
[…]
> – Somebody uses UTF-8 or UTF-16 to encode a conventional character in a
> novel way to bypass validation checks?

Thanks for reminding your readers about the security issues surrounding the UTF-8 encoding of Unicode and ISO 10646 (UCS).

For some time, this and related issues have been of considerable concern to us folks on the linux-utf8 at nl.linux.org mailing list, who try to guide and accelerate the eventually inevitable migration of the Unix world from ASCII and ISO 8859 to UTF-8 (which the Plan9 operating system demonstrated successfully almost a decade ago). New UTF-8 decoders deployed in, for instance, GNU glibc 2.2, XFree86 4.0 xterm, and various other standard tools have been carefully designed to reject so-called overlong UTF-8 sequences as malformed, so that attackers cannot abuse these decoders to bypass critical ASCII substring tests that are applied earlier in the processing pipeline.

It is still very unfortunate that even the latest Unicode 3.0 standard (ISBN 0-201-61633-5) contains at the end of section 3.8 on page 47 the following paragraph: “When converting from UTF-8 to a Unicode scalar value, implementations do not need to check that the shortest encoding is being used. This simplifies the conversion algorithm.”

This paragraph encourages the fielding of sloppy and dangerous UTF-8 decoders that will for example convert all of the following five UTF-8 sequences into a U+000A line-feed control character:

0xc0 0x8A
0xe0 0x80 0x8A
0xf0 0x80 0x80 0x8A
0xf8 0x80 0x80 0x80 0x8A
0xfc 0x80 0x80 0x80 0x80 0x8A

A “safe UTF-8 decoder” should reject them just like malformed sequences for two reasons: (1) It helps to debug applications if overlong sequences are not treated as valid representations of characters, because this helps to spot problems more quickly. (2) Overlong sequences provide alternative representations of characters, that could maliciously be used to bypass prior ASCII filters. For instance, a 2-byte encoded line feed (LF) would not be caught by a line counter that counts only 0x0A bytes, but it would still be processed as a line feed by an unsafe UTF-8 decoder later in the pipeline.
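
The bypass described above can be reproduced in a few lines. What follows is a minimal sketch in Python, not code from any of the decoders named in this letter: `naive_utf8_decode` and `count_lines_ascii` are hypothetical names, and the decoder deliberately omits the shortest-form check in order to show what an unsafe implementation would do.

```python
def naive_utf8_decode(data: bytes) -> list:
    """Decode UTF-8 into code points WITHOUT a shortest-form check (unsafe on purpose)."""
    # (lead-byte mask, expected lead bits, number of continuation bytes)
    layouts = [
        (0x80, 0x00, 0),
        (0xE0, 0xC0, 1),
        (0xF0, 0xE0, 2),
        (0xF8, 0xF0, 3),
        (0xFC, 0xF8, 4),
        (0xFE, 0xFC, 5),
    ]
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        for mask, bits, extra in layouts:
            if (lead & mask) == bits:
                break
        else:
            raise ValueError("invalid lead byte")
        tail = data[i + 1 : i + 1 + extra]
        if len(tail) != extra or any((b & 0xC0) != 0x80 for b in tail):
            raise ValueError("truncated or malformed sequence")
        # Payload bits of the lead byte, then 6 bits from each continuation byte.
        value = lead & ~mask & 0xFF
        for b in tail:
            value = (value << 6) | (b & 0x3F)
        code_points.append(value)
        i += 1 + extra
    return code_points


# The five overlong encodings of U+000A listed above.
OVERLONG_LINE_FEEDS = [
    bytes([0xC0, 0x8A]),
    bytes([0xE0, 0x80, 0x8A]),
    bytes([0xF0, 0x80, 0x80, 0x8A]),
    bytes([0xF8, 0x80, 0x80, 0x80, 0x8A]),
    bytes([0xFC, 0x80, 0x80, 0x80, 0x80, 0x8A]),
]


def count_lines_ascii(data: bytes) -> int:
    """An early-stage filter that only looks for the literal 0x0A byte."""
    return data.count(b"\n")


# The filter sees no line feed, but the unsafe decoder produces one later.
payload = b"first" + OVERLONG_LINE_FEEDS[0] + b"second"
```

Here the byte-level filter and the decoder disagree about the same input, which is exactly the gap an attacker needs.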

UTF-8 is known to be ASCII compatible, because every existing ASCII file is already a correct UTF-8 file and non-ASCII characters do not introduce additional occurrences of ASCII bytes. But from a security point of view, ASCII compatibility of UTF-8 sequences must also mean that ASCII characters are *only* allowed to be represented by ASCII bytes in the range 0x00-0x7F and not by any other byte combination. To ensure this often neglected aspect of ASCII compatibility, use only “safe UTF-8 decoders” that reject overlong UTF-8 sequences for which a shorter encoding exists, for example by substituting it with the U+FFFD replacement character.
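
As one illustration of a decoder that behaves this way, Python 3's built-in UTF-8 codec rejects overlong forms in strict mode and substitutes U+FFFD when asked to replace errors. The sketch below assumes Python 3; `decode_safely` is a hypothetical helper name, not part of any tool mentioned in this letter.

```python
# The five overlong encodings of a line feed from the list above.
OVERLONG_LINE_FEEDS = [
    bytes([0xC0, 0x8A]),
    bytes([0xE0, 0x80, 0x8A]),
    bytes([0xF0, 0x80, 0x80, 0x8A]),
    bytes([0xF8, 0x80, 0x80, 0x80, 0x8A]),
    bytes([0xFC, 0x80, 0x80, 0x80, 0x80, 0x8A]),
]


def decode_safely(data: bytes) -> str:
    """Decode UTF-8, replacing anything malformed (including overlong forms) with U+FFFD."""
    return data.decode("utf-8", errors="replace")


def is_rejected_in_strict_mode(data: bytes) -> bool:
    """Return True if the strict decoder refuses the input outright."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False
```

Either mode satisfies the requirement stated above: no overlong sequence ever comes out of the decoder as a real line feed.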

It is not true that the check for overlong UTF-8 sequences would add any significant speed penalty or complexity to the UTF-8 decoder, as for example my implementation of the decoder found in the XFree86 4.0 xterm version illustrates. The key to understanding how to implement a safe UTF-8 decoder both simply and efficiently lies in realizing that a UTF-8 sequence is overlong if and only if it contains one of the following one- or two-byte bit patterns:

1100000x (10xxxxxx)
11100000 100xxxxx (10xxxxxx)
11110000 1000xxxx (10xxxxxx 10xxxxxx)
11111000 10000xxx (10xxxxxx 10xxxxxx 10xxxxxx)
11111100 100000xx (10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx)
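
The five patterns translate almost directly into mask-and-compare tests. The following is a rough sketch in Python rather than the xterm implementation; `is_overlong` is a hypothetical name, and the function assumes it is handed exactly one otherwise well-formed sequence (a lead byte followed by its continuation bytes).

```python
def is_overlong(seq: bytes) -> bool:
    """Return True if one UTF-8 sequence matches any of the five overlong patterns."""
    if not seq:
        return False
    lead = seq[0]
    # 1100000x: any two-byte sequence starting with 0xC0 or 0xC1.
    if (lead & 0xFE) == 0xC0:
        return True
    if len(seq) < 2:
        return False
    second = seq[1]
    # For longer forms the lead byte carries no payload bits, and the
    # second byte's leading payload bits are all zero.
    patterns = (
        (0xE0, 0xE0),  # 11100000 100xxxxx
        (0xF0, 0xF0),  # 11110000 1000xxxx
        (0xF8, 0xF8),  # 11111000 10000xxx
        (0xFC, 0xFC),  # 11111100 100000xx
    )
    for lead_value, second_mask in patterns:
        if lead == lead_value and (second & second_mask) == 0x80:
            return True
    return False
```

The check costs at most a handful of comparisons on the first two bytes, which is consistent with the claim that it adds no significant speed penalty.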

A UTF-8 decoder robustness test file that allows developers to quickly check a UTF-8 decoder for its safety is available at

<http://www.cl.cam.ac.uk/~mgk25/ucs/examples/…>

For instance, major Web browsers still fail the test in section 4.1.1.

More information on UTF-8 under Unix is available at

<http://www.cl.cam.ac.uk/~mgk25/unicode.html>

From: Curt Sampson <cjscynic.net>
Subject: Re: Security Risks of Unicode

I have to say I’m rather appalled by your “Security Risks of Unicode” article. You have identified a type of security vulnerability in some systems, and pointed out that Unicode may increase the incidence of this type of vulnerability, but completely missed the source of the vulnerability.

As we’ve seen from your examples of non-Unicode systems that have experienced security failures, these problems do not stem from using any particular character set or character set interpretation. They stem from doing what I like to call “validity guessing,” rather than true validity checking.

The key factor in all of these cases is that we have two separate programs (the validity checker and the application itself) using two separate algorithms to interpret data. This is what introduces the potential for a security breach: if ever the two programs do not interpret a data stream in exactly the same way (and this can easily happen if the two programs are not maintained by the same person or group), it may become possible to convince the application to do something the validator does not want to allow.

When it comes to security, guessing just isn’t good enough. This is why, when we have parameters from external sources, we use the exec() system call to run programs under Unix rather than the system() library function. We don’t pass random data to the shell for interpretation because we can never be sure how a particular implementation of a particular shell on a particular system will interpret it. (We can’t even be sure of what shell we’re using—/bin/sh may be any of a number of different programs.)
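
The same distinction exists in higher-level languages. As a rough Python analogue (an illustration, not the C calls named above), passing an argument list to subprocess.run starts the program directly, with no shell involved. The file name and the child command here are made up for the demonstration; the child is just another Python process that echoes its first argument.

```python
import subprocess
import sys

# Hypothetical untrusted input containing a shell metacharacter.
untrusted = "report.txt; echo INJECTED"

# Argument-list form: no shell is involved, so the whole string arrives
# in the child as a single argv entry and the ";" has no special meaning.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.rstrip("\n")

# By contrast, building a command string and running it with shell=True
# would hand the ";" to whatever shell happens to be installed.
```

In this sketch the child receives the untrusted string byte for byte, because nothing in between tried to interpret it.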

As long as we shift the blame for badly designed security systems to external standards that are not the source of the problem, we will have insecure systems. Security is something that needs to be built in to systems from the beginning, not tacked on with separate programs at the end.

From: Henry Spencer <henryspsystems.net>
Subject: Re: Security Risks of Unicode

You have a point about potential input-validation attacks in Unicode, given the much greater complexity of the character set… but I think you have missed a couple of more important points.

Trying to analyze the input string for metacharacters, odd delimiters, etc. is basically a mistake. I speak as someone who’s written code to do this, by the way—it always smelled like a kludge to me, and now I understand why.

First, prepending an input validator to a complex interpreter is a fundamentally insecure approach. Unless you are prepared to impose truly severe restrictions on which features of the interpreter are available—in which case, why bother with the interpreter at all?—the validator becomes an attempt to reinvent the interpreter’s parser and some of its semantic analysis. This is an inherently error-prone approach, as shown by various successful input-validation attacks. The validator is a complex piece of software which must achieve and maintain an exact relationship with the interpreter, which is all the more difficult if the interpreter is ill-documented (as most complex interpreters are) and constantly changing (ditto).

The right way—the *only* right way—to deal with this problem is to insist that such interpreters include a show-only mode (“process this input and tell me what it would make you do BUT DON’T DO IT”). This can be awkward for interpreters with complex programmability and interactions with their environment; it may amount to actually running the interpreter, but in a controlled and monitored environment with dummy resources. There can still be bugs—unintended differences between the show-only mode and the real mode—but if the interpreter is well organized, almost all of the show-only work is being done by the real code rather than a cheap independently-maintained fake, and there is at least a fighting chance that the behaviors will match.
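
A toy Python sketch of the show-only idea follows. The two-command language and the names plan, run, and show_only are invented for illustration. The point is only that both modes share one parser, so the report and the real run cannot drift apart at the parsing stage.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[str, str]  # (operation, argument)


def plan(script: str) -> List[Action]:
    """Parse a script into actions. This is the single parser for both modes."""
    actions = []
    for line in script.splitlines():
        line = line.strip()
        if not line:
            continue
        op, _, arg = line.partition(" ")
        if op not in ("print", "delete"):
            raise ValueError(f"unknown operation: {op!r}")
        actions.append((op, arg))
    return actions


def run(script: str,
        handlers: Dict[str, Callable[[str], None]],
        show_only: bool = False) -> List[Action]:
    """Return what the script would do; perform it unless show_only is set."""
    actions = plan(script)
    if not show_only:
        for op, arg in actions:
            handlers[op](arg)
    return actions


# Dummy resources: "delete" only records its argument.
deleted: List[str] = []
handlers = {"print": lambda arg: None, "delete": deleted.append}

preview = run("print hi\ndelete /tmp/x", handlers, show_only=True)
```

A caller can inspect preview, decide whether the requested actions are acceptable, and only then call run again without show_only.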

(A do-only-safe-things mode is also of interest, but not as satisfactory. Definitions of safety may not match, and interpreter bugs are arguably more likely to affect the outcome.)

Second, less confidently, I have to wonder whether elaborate parsing isn’t a mistake anyway. When the context is program talking to program, it would be better to define the simplest format possible, so that parsing becomes trivial and there is no room for misunderstandings. This need not imply either binary data formats or simple semantics; for example, one can send a complex tree structure in prefix or postfix notation, one node per (text) line. Of course, all too often the option isn’t available because the format is predefined by a 700-page standard, but the possibility is worth bearing in mind.
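
As one possible illustration in Python, a tree can be sent in prefix order with each text line holding a child count followed by a label. This line layout is an assumption made for the sketch, not a format proposed in the letter, and it requires that labels contain no newlines.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def dump(node: Node) -> List[str]:
    """Serialize in prefix order; each line is '<child count> <label>'."""
    lines = [f"{len(node.children)} {node.label}"]
    for child in node.children:
        lines.extend(dump(child))
    return lines


def load(lines: Iterator[str]) -> Node:
    """Rebuild a tree by consuming lines from an iterator in prefix order."""
    count, label = next(lines).split(" ", 1)
    return Node(label, [load(lines) for _ in range(int(count))])


tree = Node("add", [Node("x"), Node("mul", [Node("2"), Node("y")])])
wire = dump(tree)
# wire == ["2 add", "0 x", "2 mul", "0 2", "0 y"]
restored = load(iter(wire))
```

The parser is a single split per line plus recursion, which leaves little room for two implementations to read the same input differently.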

From: Michael Smith <smithmbusa.net>
Subject: Re: Security Risks of Unicode

Speak of the devil…

Apparently, the dangers of Unicode you discussed in the latest Crypto-Gram are not far off. It’s already going into use for domain names: “Asian-language domain names now available,” at <http://www.cnn.com/2000/TECH/computing/07/17/…>.

From: Johan Ihren <johanipdc.kth.se>
Subject: Re: SOAP

> Firewalls have good reasons for blocking protocols like
> DCOM coming from untrusted sources. Protocols that sneak
> them through are not what’s wanted.

I don’t really agree with this statement. I think tunneling protocols are a rather obvious consequence of the faith in firewalls as the right solution to both having the cookie and eating it too, as in being connected to the Internet, while being safe from its dangers.

If the security one hoped to gain from deploying a firewall with the HTTP port open is somehow compromised by tunneling other stuff over HTTP then the real problem lies in the firewall (or possibly one’s faith in its abilities), not merely in the other protocol.

The point with the firewall is basically that “outside” is dangerous while “inside” is presumably insecure. But it would be too expensive to go around securing all machines and services on the inside. So instead a firewall is deployed to ensure that although the stuff on the inside is insecure it still won’t get compromised by the dangers on the outside.

If the firewall doesn’t catch “creative” new protocols tunneling on top of whatever ports/protocols are open then the firewall was a faulty solution providing a dangerous illusion of security.

In principle it doesn’t really matter whether the protocol is designed in Redmond or by anonymous crackers, although in reality stuff from Microsoft is likely to get installed on a somewhat larger fraction of machines than stuff from other sources.

So pointing out the consequences to the designer of one particular tunneling protocol is of course fine. But even at best that won’t do more than alleviate the pain of knowing that regardless of what ports are closed in the firewall, as long as some port servicing a sufficiently complicated protocol (like HTTP) is left open it will be possible to have unknown communication between unknown software on the inside and unknown software on the outside with (at best) unknown results.

But wasn’t that exactly what the firewall was meant to stop?

Or, to put this another way: whatever security is provided by a firewall isn’t improved by a software design rule that forbids tunneling through the firewall over other protocols.

CRYPTO-GRAM is a free monthly newsletter providing summaries, analyses, insights, and commentaries on computer security and cryptography.

To subscribe, visit <http://www.schneier.com/crypto-gram.html> or send a blank message to crypto-gram-subscribe@chaparraltree.com. To unsubscribe, visit <http://www.schneier.com/crypto-gram-faq.html>. Back issues are available on <http://www.counterpane.com>.

Please feel free to forward CRYPTO-GRAM to colleagues and friends who will find it valuable. Permission is granted to reprint CRYPTO-GRAM, as long as it is reprinted in its entirety.

CRYPTO-GRAM is written by Bruce Schneier. Schneier is founder and CTO of Counterpane Internet Security Inc., the author of “Applied Cryptography,” and an inventor of the Blowfish, Twofish, and Yarrow algorithms. He served on the board of the International Association for Cryptologic Research, EPIC, and VTW. He is a frequent writer and lecturer on computer security and cryptography.

Counterpane Internet Security, Inc. is a venture-funded company bringing innovative managed security solutions to the enterprise.

<http://www.counterpane.com/>