The term IDN (internationalized domain name) refers to any domain that contains characters outside the standard ASCII (American Standard Code for Information Interchange) character set, including domains that contain umlauts (modified vowels such as ä, ö or ü). Special characters pose a problem: an English-speaking user, for example, must not only be familiar with the Greek alphabet but also be able to type its letters, and those letters must be matched to the intended English ones. Keyboards differ from country to country and do not offer the same special characters, which makes it hard to enter characters used in another country directly. Because scripts do not correspond exactly, the spelling of some words has to be guessed. The internationalized domain name was introduced to address these differences in umlauts and special characters between languages.
Conversion of IDNs to ASCII form
A converted IDN is marked by the prefix “xn--” (“xn” followed by two hyphens). When an IDN is accessed, the domain is converted to Punycode in the background and thereby encoded in a form that is compatible with ASCII. All characters that are not part of the ASCII set are removed from their position in the name and appended in encoded form at the end, so that every recognized special character is represented by an ASCII character sequence. This procedure enables browsers to process the IDN without running into problems.
You can use the DENIC conversion tool to see what the converted IDN would look like. For the example domain www.mr-müller.de, the converted form is www.xn--mr-mller-95a.de.
The system for converting internationalized domain names was introduced in 2003 and is called Internationalizing Domain Names in Applications (IDNA2003 for short).
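The conversion described above can be tried out with Python's standard library. This is a minimal sketch, not a reference implementation: the helper names `to_ascii` and `to_unicode` are made up for illustration, and Python's built-in `idna` codec follows the 2003 rules, so results for some names can differ from software that implements a later revision.

```python
# Sketch: converting an IDN to its ASCII-compatible form and back.
# Uses only Python's built-in "punycode" and "idna" codecs.
# The helper names below are illustrative, not part of any standard API.

def to_ascii(domain: str) -> str:
    """Return the ASCII-compatible form of a domain (each label converted separately)."""
    return domain.encode("idna").decode("ascii")


def to_unicode(ascii_domain: str) -> str:
    """Return the Unicode form of a domain that may contain xn-- labels."""
    return ascii_domain.encode("ascii").decode("idna")


# Raw Punycode of a single label: the ASCII characters stay in place,
# the non-ASCII character (ü) is encoded and appended after the last hyphen.
raw_label = "mr-müller".encode("punycode").decode("ascii")
print(raw_label)                               # mr-mller-95a

# The full conversion adds the "xn--" prefix to every converted label.
print(to_ascii("www.mr-müller.de"))            # www.xn--mr-mller-95a.de
print(to_unicode("www.xn--mr-mller-95a.de"))   # www.mr-müller.de
```

Labels that are already pure ASCII, such as "www" and "de", pass through unchanged; only the label containing the umlaut receives the prefix.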
System revision for translating IDNs
In the original translation system, IDNA2003, the domain name was first normalized using the Nameprep procedure. Normalization replaced all capital letters with lowercase letters and mapped characters to their specified equivalents. For example, the character “ß” was mapped to its specified equivalent “ss”.
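The effect of this normalization step can be observed with Python's standard library, whose `encodings.idna` module exposes a `nameprep` function. The sketch below only illustrates the two mappings mentioned above (lowercasing and “ß” to “ss”); the sample labels are chosen for illustration.

```python
# Sketch: the Nameprep normalization step of IDNA2003, using Python's stdlib.
from encodings.idna import nameprep

# Capital letters are replaced with lowercase letters.
print(nameprep("MÜLLER"))    # müller

# "ß" is mapped to its specified equivalent "ss".
print(nameprep("Straße"))    # strasse

# Because the normalized label is then pure ASCII, no xn-- form is needed.
converted = "Straße.de".encode("idna").decode("ascii")
print(converted)             # strasse.de
```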
In the revised version, IDNA2008, normalization is no longer part of IDNA itself but is left to the user interface. The IDNA2008 revision, introduced in 2010, was intended to solve the problems resulting from IDNA2003: it expanded the range of valid characters in domain names and established an automated process for updating to future versions of the Unicode standard. IDNA2008 also provides a clear definition of a valid domain name, so that registrants can immediately see which character sequence will be registered.
For many common domain names, the process is the same in IDNA2008 as in IDNA2003: both transform a Unicode domain name into its Punycode version. However, the revision is not fully compatible with IDNA2003. This is why browsers may still handle some names inconsistently during conversion, despite the introduction of IDNA2008.
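One well-known incompatibility concerns the character “ß”. The sketch below contrasts the two outcomes using only Python's standard library. It rests on an assumption: the built-in `idna` codec implements the 2003 rules only, so the second value merely approximates what IDNA2008-aware software would produce, by Punycode-encoding the label without the Nameprep mapping. It is not a full IDNA2008 implementation, and the function names are hypothetical.

```python
# Sketch: why the same Unicode label can resolve to two different ASCII names.

def ace_idna2003(label: str) -> str:
    """ASCII form under the 2003 rules (Nameprep maps ß to ss first)."""
    return label.encode("idna").decode("ascii")


def ace_without_mapping(label: str) -> str:
    """Approximation (assumption): encode the label as typed, keeping ß."""
    return "xn--" + label.encode("punycode").decode("ascii")


label = "faß"
print(ace_idna2003(label))          # fass
print(ace_without_mapping(label))   # xn--fa-hia

# The two results are different domain names, so software following
# different revisions may end up at different destinations.
```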
IDN domains are not always possible
There are still TLDs under which domains can only be registered using ASCII characters. The registration requirements of each registry stipulate whether domains under the desired TLD can be registered with special characters or umlauts.
For example, DENIC, the registry for .de domains, currently allows 93 special characters.
The advantages of IDNs
- Especially with umlauts (e.g. ä, ö and ü), the transcription (e.g. ae, oe and ue) looks clumsy and unattractive. Owners of established brand names are also often very reluctant to forgo the correct spelling. Both are strong arguments for using IDNs.
- Umlaut domains often haven’t been registered yet and are therefore still widely available for registration.
The disadvantages of IDNs
- Not every language uses the same special characters; the German umlauts ä, ö and ü are one example. For users outside German-speaking regions, this can be a problem: the character isn’t on the keyboard and therefore can’t be typed.
- Umlaut domains often cause technical problems for content management systems, as well as for email programs. To prevent these, the software on both the sender and the recipient side must be kept up to date. Even so, emails to IDN addresses sometimes still can’t be sent, especially with freemail services.
- Complications may also arise with regard to search engine optimization. Older content management systems do not reliably recognize spellings with special characters and webmasters are also not always familiar with umlauts and other special forms. If backlinks and social shares do not work, rankings suffer.
InterNetX recommends the following when using IDNs
When using IDNs, we recommend implementing a multi-domain strategy. This means that you should register more than one domain for a company or a project and not rely on the success of a single domain that uses an umlaut or special character. In order to make sure that problem areas are avoided, it’s a good idea to own the IDN as well as the transcribed version. The version without special characters can then be redirected to the IDN.
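As a rough illustration of this strategy, the following sketch derives the names to register from a single IDN: the Unicode form, its converted ASCII form, and the transcribed version. The function name `domain_variants` and the transcription table are hypothetical and cover only the German umlauts mentioned in this article; the ASCII form comes from Python's built-in `idna` codec, which follows the 2003 rules.

```python
# Sketch: listing the domain variants to register for a multi-domain strategy.
# The table and function name are illustrative assumptions, not a standard.

TRANSCRIPTIONS = {"ä": "ae", "ö": "oe", "ü": "ue"}


def domain_variants(idn: str) -> dict[str, str]:
    """Return the IDN, its xn-- form, and the transcribed ASCII version."""
    name = idn.lower()
    transcribed = "".join(TRANSCRIPTIONS.get(ch, ch) for ch in name)
    return {
        "idn": name,                                  # name as users see it
        "ace": name.encode("idna").decode("ascii"),   # form stored in the DNS
        "transcribed": transcribed,                   # version without umlauts
    }


variants = domain_variants("müller.de")
print(variants["idn"])          # müller.de
print(variants["ace"])          # xn--mller-kva.de
print(variants["transcribed"])  # mueller.de
```

The transcribed name would then be the one redirected to the IDN, as recommended above.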