Comments

Israel Torres March 14, 2005 9:20 AM

As long as something is maintained to be “backwards-compatible” it will have a weak link that will usually be the first to break.

Israel Torres

Johannes March 15, 2005 5:34 AM

Opera Software were at first reluctant to fix this “bug”, since they had in fact only implemented IDN as specified. In the current betas it is fixed in the following way:
– Some domains, which are considered “safe”, will render IDN URLs normally (for instance .no domains, where the only allowed characters are the Latin alphabet plus the Norwegian characters æ, ø, å).
– Other domains (such as .com, which will allow any Unicode URL) will render IDNs encoded, such as http://www.xn--pypal-4ve.com instead of http://www.paypal.com.
An acceptable solution, rendering some phishing attacks difficult, IMO.
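
A rough sketch of the display policy described above, in Python. The `SAFE_TLDS` set and the `display_host` function are made-up names for illustration, not Opera's actual code, and the sketch only looks at the top-level domain; it does not check which characters a registry allows.

```python
# Hypothetical whitelist of TLDs whose registries restrict the allowed characters.
SAFE_TLDS = {"no"}

def display_host(host: str) -> str:
    """Return the host name as a browser might show it under this policy."""
    tld = host.rsplit(".", 1)[-1].lower()
    if tld in SAFE_TLDS:
        # Trusted registry: show the Unicode form unchanged.
        return host
    # Untrusted registry: show the ASCII-compatible (xn--) form instead.
    # Python's built-in "idna" codec encodes each non-ASCII label with Punycode.
    return host.encode("idna").decode("ascii")

print(display_host("www.p\u0430ypal.com"))   # second letter is CYRILLIC SMALL LETTER A
print(display_host("bl\u00e5b\u00e6r.no"))   # shown as typed, since .no is whitelisted
```

With the Cyrillic "а" in place, the .com name comes out as www.xn--pypal-4ve.com, which matches the encoded URL quoted above.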

@Torres
There are hacks with Unicode that do not depend on backward-compatibility issues. An example is the paypal hack, which exploits the fact that there are two “a”s in the Unicode space that look exactly the same.
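
To make the two “a”s concrete, here is a small Python sketch. It assumes the second one is the Cyrillic small letter a (U+0430), which is consistent with the xn--pypal-4ve encoding quoted earlier; the variable names are only for illustration.

```python
import unicodedata

latin_a = "\u0061"     # LATIN SMALL LETTER A
cyrillic_a = "\u0430"  # CYRILLIC SMALL LETTER A, usually drawn identically

genuine = "paypal"
spoofed = "p" + cyrillic_a + "ypal"

# The two letters have different names and code points ...
print(unicodedata.name(latin_a), hex(ord(latin_a)))
print(unicodedata.name(cyrillic_a), hex(ord(cyrillic_a)))

# ... so the two strings differ even though they look alike on screen.
print(genuine == spoofed)

# The Punycode form of the spoofed label makes the difference visible.
ace_label = "xn--" + spoofed.encode("punycode").decode("ascii")
print(ace_label)
```

The comparison prints False, and the last line prints the label in its xn-- form rather than the look-alike text.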

Chung Leong March 15, 2005 1:06 PM

It’s really impossible to “fix” Unicode as a whole. There are so many writing systems and few people–if any–have an understanding of all of them.

The first step to solving the security problem with IDN, I think, is to define zones of allowable codepoints. Each zone would encompass a writing system–Latin, Cyrillic, Arabic, etc. A domain name would only have characters within a given zone. Names in a zone would be governed by a zone-specific set of rules, to be determined by people from countries that use the script.

For example, the Basic Latin zone could be defined as containing only letters used in the major European languages. When the browser encounters a name with a Cyrillic letter, it’ll see that it does not fall in that zone and will look at the definitions of other zones.

The Basic Cyrillic zone might be defined as containing letters used in modern East Slavic languages plus the basic Latin set. A rule in that zone might stipulate that a Cyrillic letter cannot be immediately next to a Latin one. A name like the one used in the Secunia exploit would thus fail, and the browser would move on to look for another possible zone in which to place the name. After it has tested the name against all the zones it knows, it would give up and display the name in punycode, and maybe throw up an alert message.
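
A minimal Python sketch of how such a zone check could work. Everything here is an assumption for illustration: the zone names, the `script_of` helper (which guesses the script from the first word of the Unicode character name, a crude approximation) and the `display_label` function. The Latin zone in this sketch accepts any Latin-named letter rather than only those of the major European languages.

```python
import unicodedata

def script_of(ch: str) -> str:
    """Crude script guess: first word of the Unicode character name."""
    if ch.isascii() and (ch.isdigit() or ch == "-"):
        return "COMMON"  # digits and hyphen are shared by all zones
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]

def latin_zone(label: str) -> bool:
    return all(script_of(c) in {"LATIN", "COMMON"} for c in label)

def cyrillic_zone(label: str) -> bool:
    scripts = [script_of(c) for c in label]
    if any(s not in {"CYRILLIC", "LATIN", "COMMON"} for s in scripts):
        return False
    # Zone rule: a Cyrillic letter may not sit directly next to a Latin one.
    for left, right in zip(scripts, scripts[1:]):
        if {left, right} == {"CYRILLIC", "LATIN"}:
            return False
    return True

# Zones are tried in order; the first one that accepts the label wins.
ZONES = [("Basic Latin", latin_zone), ("Basic Cyrillic", cyrillic_zone)]

def display_label(label: str):
    """Return (text to display, zone name), or the punycode form and None."""
    for name, accepts in ZONES:
        if accepts(label):
            return label, name
    # No zone accepted the label: fall back to punycode.
    return "xn--" + label.encode("punycode").decode("ascii"), None

print(display_label("paypal"))
print(display_label("\u043f\u043e\u0447\u0442\u0430"))  # an all-Cyrillic label
print(display_label("p\u0430ypal"))                     # Cyrillic "a" among Latin letters
```

The mixed label fails the Latin zone (it contains a Cyrillic letter) and the Cyrillic zone (the adjacency rule), so it is shown in punycode. A browser could raise the alert at the same point where the zone name comes back as None.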

The idea is to break a large problem into smaller ones, which we can then solve one by one, incrementally.

