Punycode



         




Punycode, defined in RFC 3492, is an instance of the general "Bootstring" encoding algorithm. It maps Unicode strings into the limited character set (ASCII letters, digits and hyphens) supported by the Domain Name System. The encoding is used as part of IDNA (Internationalizing Domain Names in Applications), a system that enables internationalized domain names in all scripts supported by Unicode and places the burden of translation entirely on the user application (e.g., a web browser).

For example, bücher becomes bcher-kva in Punycode, and therefore the domain name bücher.ch would be represented as xn--bcher-kva.ch in IDNA.
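The example above can be reproduced with Python's built-in "punycode" and "idna" codecs. This is only a minimal sketch: the "idna" codec implements the original 2003 version of IDNA, and the variable names are illustrative.

```python
# Encode a single label with the raw Punycode codec (no "xn--" prefix).
label = "bücher"
encoded = label.encode("punycode")
print(encoded)                     # b'bcher-kva'
print(encoded.decode("punycode"))  # bücher

# Encode a whole domain name with the IDNA codec, which applies
# Punycode label by label and adds the "xn--" prefix where needed.
domain = "bücher.ch".encode("idna")
print(domain)                 # b'xn--bcher-kva.ch'
print(domain.decode("idna"))  # bücher.ch
```

Note that the pure-ASCII label "ch" passes through unchanged; only labels containing non-ASCII characters are converted.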

Punycode is designed to work across all script systems, and to be self-optimising by adapting to the character sets within the string as it operates. It is optimised for the case where the string consists of zero or more ASCII characters plus characters from only one other script system, but it will cope with any arbitrary Unicode string. Note that for DNS use, the domain name string is assumed to have been normalised using Nameprep before being Punycoded, and that the DNS protocol limits the acceptable length of the output Punycode string (63 octets per label, including the xn-- prefix).
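The DNS preparation steps described above (Nameprep first, then Punycode, then a length check) can be sketched as follows. The helper name to_ace_label and the two constants are hypothetical, chosen for illustration; nameprep is the helper function shipped in Python's standard encodings.idna module. The sketch assumes the 63-octet DNS label limit and omits the other checks a full IDNA implementation performs.

```python
from encodings.idna import nameprep  # stdlib Nameprep helper

ACE_PREFIX = b"xn--"   # prefix marking a Punycode-encoded label in IDNA
MAX_LABEL_OCTETS = 63  # assumed DNS limit on the length of one label


def to_ace_label(label: str) -> bytes:
    """Hypothetical helper: normalise one label, Punycode it, check its length."""
    prepared = nameprep(label)  # case folding, normalisation, prohibited-character checks
    if prepared.isascii():
        # Pure-ASCII labels need no Punycode and no prefix.
        ace = prepared.encode("ascii")
    else:
        ace = ACE_PREFIX + prepared.encode("punycode")
    if not 0 < len(ace) <= MAX_LABEL_OCTETS:
        raise ValueError(f"label is empty or longer than {MAX_LABEL_OCTETS} octets")
    return ace
```

With this sketch, to_ace_label("Bücher") returns b'xn--bcher-kva' because Nameprep lower-cases the label before encoding, while a label of 100 "ü" characters raises ValueError because its encoded form exceeds the limit.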

The national registrars of Germany, Austria and Switzerland adopted Punycode on March 1, 2004.






This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License.