Bruce Schneier

 
 

Schneier on Security

A blog covering security and security technology.


March 13, 2005

Fixing Unicode

The Unicode community is working on fixing the security vulnerabilities I talked about here and here. They have a draft technical report that they're looking for comments on. A solution to these security problems will take a concerted effort, since there are many different kinds of issues involved. (In some ways, the "paypal.com" hack is one of the simpler cases.)

Posted on March 13, 2005 at 9:31 AM. 4 comments.


Comments

We have a talk on Unicode this Wednesday. If you live in the San Francisco Bay Area, you can attend for free:

http://sfbayacm.org/activities.html#this_month

Posted by: Martin Stein at March 13, 2005 10:04 PM


As long as something is maintained to be "backwards-compatible" it will have a weak link that will usually be the first to break.

Israel Torres

Posted by: Israel Torres at March 14, 2005 9:20 AM


Opera Software was at first reluctant to fix this "bug", since they had in fact only implemented IDN as described. In the current betas it is fixed in the following way:
- Some domains, which are considered "safe", will render IDN URLs normally (for instance .no domains, where the only allowed characters are the Latin alphabet plus the Norwegian characters æøå).
- Other domains (such as .com, which allows any Unicode URL) will render IDNs encoded, such as www.xn--pypal-4ve.com instead of www.paypal.com.
An acceptable solution, rendering some phishing attacks difficult, IMO.
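
The display policy described above can be sketched roughly as follows. This is only an illustration, not Opera's actual code: the SAFE_TLDS table and the display_host function are hypothetical names, and the .no character set is simplified to what the comment mentions. It relies on Python's built-in "idna" codec (IDNA 2003) to produce the encoded form.

```python
import string

# ASCII characters permitted in any host label.
ASCII_LABEL_CHARS = set(string.ascii_lowercase + string.digits + "-")

# Hypothetical table of "safe" TLDs and the characters each one allows.
# Assumption: .no permits ASCII plus the Norwegian letters mentioned above.
SAFE_TLDS = {
    "no": ASCII_LABEL_CHARS | set("æøå"),
}

def display_host(host: str) -> str:
    """Return the form of `host` a browser might show in its address bar."""
    host = host.lower()
    labels = host.split(".")
    allowed = SAFE_TLDS.get(labels[-1])
    # Show the name as typed only if the TLD is trusted and every
    # character in every label is in that TLD's allowed set.
    if allowed is not None and all(set(label) <= allowed for label in labels):
        return host
    # Otherwise fall back to the ASCII-compatible (punycode) form.
    # Pure-ASCII names pass through this step unchanged.
    return host.encode("idna").decode("ascii")

print(display_host("www.p\u0430ypal.com"))  # first "a" is Cyrillic U+0430
print(display_host("bl\u00e5b\u00e6r.no"))
```

With these assumptions the spoofed .com name is shown in its xn-- form, while a .no name made only of allowed characters is shown as written.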

@Torres
There are hacks with Unicode that do not depend on backward-compatibility issues. An example is the paypal hack, which exploits the fact that there are two "a"s in the Unicode space that look exactly the same: the Latin "a" (U+0061) and the Cyrillic "а" (U+0430).
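
To make the look-alike point concrete, here is a small sketch using Python's standard unicodedata module. The variable names are only illustrative.

```python
import unicodedata

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430, renders like a Latin "a" in most fonts

# The two characters have different code points and different names.
for ch in (latin_a, cyrillic_a):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

genuine = "paypal.com"
spoofed = "p" + cyrillic_a + "ypal.com"

# The strings have the same length and look alike on screen,
# but they compare as different names.
print(len(genuine) == len(spoofed))
print(genuine == spoofed)
```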

Posted by: Johannes at March 15, 2005 5:34 AM


It's really impossible to "fix" Unicode as a whole. There are so many writing systems and few people--if any--have an understanding of all of them.

The first step to solving the security problem with IDN, I think, is to define zones of allowable codepoints. Each zone would encompass a writing system--Latin, Cyrillic, Arabic, etc. A domain name would only have characters within a given zone. Names in a zone would be governed by a zone-specific set of rules, to be determined by people from countries that use the script.

For example, the Basic Latin zone could be defined as containing only letters used in the major European languages. When the browser encounters a name with a Cyrillic letter, it'll see that it does not fall in that zone and will look at the definitions of other zones.

The Basic Cyrillic zone might be defined as containing letters used in modern East Slavic languages plus the basic Latin set. A rule in that zone might stipulate that a Cyrillic letter cannot be immediately next to a Latin one. A name like the one used in the Secunia exploit would thus fail, and the browser would move on to look for another possible zone in which to place the name. After it has tested the name against all the zones it knows, it would give up and display the name in punycode, and maybe throw up an alert message.
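
A rough sketch of this zone lookup, under loud assumptions: script_of, the two zone functions, ZONES, and classify are all hypothetical names; the script of a character is guessed from the first word of its Unicode name, which is a heuristic and not a real script property lookup; and the Latin zone here accepts any Latin letter, which is broader than "letters used in the major European languages". The punycode fallback uses Python's built-in "idna" codec.

```python
import unicodedata

def script_of(ch: str) -> str:
    """Guess a character's script from its Unicode name (heuristic only)."""
    if ch.isascii():
        return "Latin" if ch.isalpha() else "Common"  # digits, hyphen
    name = unicodedata.name(ch, "")
    return name.split(" ")[0].capitalize()  # "CYRILLIC SMALL LETTER A" -> "Cyrillic"

def basic_latin_ok(label: str) -> bool:
    # Zone rule: Latin letters, digits, and hyphens only.
    return all(script_of(c) in ("Latin", "Common") for c in label)

def basic_cyrillic_ok(label: str) -> bool:
    # Zone rule: Cyrillic plus basic Latin, but a Cyrillic letter
    # may not sit immediately next to a Latin one.
    scripts = [script_of(c) for c in label]
    if any(s not in ("Cyrillic", "Latin", "Common") for s in scripts):
        return False
    return all({a, b} != {"Cyrillic", "Latin"}
               for a, b in zip(scripts, scripts[1:]))

# Zones are tried in order; the first one whose rules accept the label wins.
ZONES = [("Basic Latin", basic_latin_ok), ("Basic Cyrillic", basic_cyrillic_ok)]

def classify(label: str):
    """Return (zone name or None, text to display) for one domain label."""
    for zone_name, accepts in ZONES:
        if accepts(label):
            return zone_name, label
    # No zone accepted the label: display it in punycode instead.
    return None, label.encode("idna").decode("ascii")

print(classify("paypal"))
print(classify("p\u0430ypal"))  # Cyrillic "a" next to Latin letters
```

A browser following this idea could also raise an alert whenever classify returns None for the zone; adding a new writing system would mean appending one more entry to ZONES.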

The idea is to break a large problem into smaller ones, which we can then solve one by one, incrementally.

Posted by: Chung Leong at March 15, 2005 1:06 PM

