Schneier on Security
A blog covering security and security technology.
March 13, 2005
The Unicode community is working on fixing the security vulnerabilities I talked about here and here. They have a draft technical report that they're looking for comments on. Solving these security problems will take a concerted effort, since there are many different kinds of issues involved. (In some ways, the "paypal.com" hack is one of the simpler cases.)
Posted on March 13, 2005 at 9:31 AM
• 4 Comments
As long as something is maintained to be "backwards-compatible" it will have a weak link that will usually be the first to break.
Opera Software was at first reluctant to fix this "bug", since they had in fact only implemented IDN as specified. In current betas it is fixed in the following way:
- Some domains that are considered "safe" will render IDN URLs normally (for instance, .no domains, where the only allowed characters are the Latin alphabet plus the Norwegian characters æøå).
- Other domains (such as .com, which allows any Unicode URL) will render IDNs encoded, such as www.xn--pypal-4ve.com instead of the look-alike www.paypal.com (spelled with a Cyrillic "а").
An acceptable solution that makes some phishing attacks more difficult, IMO.
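The display policy described in the list above can be sketched in a few lines of Python. This is only an illustration, not Opera's actual code: the `SAFE_TLDS` set and the `display_name` function are made-up names, and the sketch relies on Python's built-in `idna` codec to produce the encoded form.

```python
# Hypothetical allowlist of top-level domains whose registries restrict
# the characters that may be registered. The contents are an assumption.
SAFE_TLDS = {"no"}


def display_name(hostname: str) -> str:
    """Return the form of `hostname` a browser would show the user."""
    tld = hostname.rsplit(".", 1)[-1].lower()
    if tld in SAFE_TLDS:
        # Trusted registry: show the Unicode form unchanged.
        return hostname
    # Untrusted registry: show the ASCII (punycode) form instead.
    # Pure-ASCII names pass through this encoding unchanged.
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    spoof = "www.p\u0430ypal.com"  # second letter is Cyrillic U+0430
    print(display_name(spoof))
    print(display_name("www.paypal.com"))
    print(display_name("www.bl\u00e5b\u00e6r.no"))
```

With this policy the spoofed .com name is shown in its encoded form, while the genuine ASCII name and the .no name are shown as typed.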
There are hacks with Unicode that do not depend on backward-compatibility issues. An example is the paypal hack, which exploits the fact that there are two "a"s in the Unicode code space that look exactly the same.
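To make the "two a's" point concrete, here is a small Python sketch using the standard `unicodedata` module. The variable names are only illustrative. It shows that the Latin and Cyrillic letters are distinct code points even though most fonts draw them identically.

```python
import unicodedata

latin_a = "a"          # U+0061
cyrillic_a = "\u0430"  # U+0430, usually rendered the same as the Latin letter

# Print the code point and official Unicode name of each character.
for ch in (latin_a, cyrillic_a):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The two strings compare as different, so a name containing one is a
# different domain from a name containing the other.
print(latin_a == cyrillic_a)
```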
It's really impossible to "fix" Unicode as a whole. There are so many writing systems and few people--if any--have an understanding of all of them.
The first step to solving the security problem with IDN, I think, is to define zones of allowable codepoints. Each zone would encompass a writing system--Latin, Cyrillic, Arabic, etc. A domain name would only have characters within a given zone. Names in a zone would be governed by a zone-specific set of rules, to be determined by people from countries that use the script.
For example, the Basic Latin zone could be defined as containing only letters used in the major European languages. When the browser encounters a name with a Cyrillic letter, it'll see that it does not fall in that zone and will look at the definitions of other zones.
The Basic Cyrillic zone might be defined as containing letters used in modern East Slavic languages plus the basic Latin set. A rule in that zone might stipulate that a Cyrillic letter cannot be immediately next to a Latin one. A name like the one used in the Secunia exploit would thus fail, and the browser would move on to look for another possible zone in which to place the name. After it has tested the name against all the zones it knows, it would give up and display the name in punycode, and maybe throw up an alert message.
The idea is to break a large problem into smaller ones, which we can then solve one by one, incrementally.
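The zone lookup proposed above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the zone names, the `script_of` helper (which guesses a script from the character's Unicode name, since the standard library has no script property), and the simplification that any Latin or Cyrillic letter counts as belonging to the "basic" zone.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Very rough script guess based on the Unicode character name."""
    if ch in "0123456789-":
        return "COMMON"
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "LATIN"
    if name.startswith("CYRILLIC"):
        return "CYRILLIC"
    return "OTHER"


def in_latin_zone(label: str) -> bool:
    # Hypothetical Basic Latin zone: Latin letters, digits, hyphen.
    return all(script_of(c) in ("LATIN", "COMMON") for c in label)


def in_cyrillic_zone(label: str) -> bool:
    # Hypothetical Basic Cyrillic zone: Cyrillic plus Latin letters,
    # but a Cyrillic letter may not sit directly next to a Latin one.
    scripts = [script_of(c) for c in label]
    if any(s == "OTHER" for s in scripts):
        return False
    return all({a, b} != {"LATIN", "CYRILLIC"}
               for a, b in zip(scripts, scripts[1:]))


# Zones are tried in order; the first one that accepts the label wins.
ZONES = [("Basic Latin", in_latin_zone), ("Basic Cyrillic", in_cyrillic_zone)]


def classify(label: str):
    for zone_name, accepts in ZONES:
        if accepts(label):
            return zone_name
    return None  # no known zone accepts this label


def display_label(label: str) -> str:
    # Fall back to punycode when no zone accepts the label.
    if classify(label) is None:
        return label.encode("idna").decode("ascii")
    return label


if __name__ == "__main__":
    for label in ("paypal", "\u043f\u0440\u0438\u043c\u0435\u0440", "p\u0430ypal"):
        print(classify(label), display_label(label))
```

Adding a writing system then means adding one more entry to `ZONES`, which matches the incremental approach: each zone's rules can be worked out separately by people who know that script.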
Schneier.com is a personal website. Opinions expressed are not necessarily those of Co3 Systems, Inc.