KOI8-R

KOI8-R is an 8-bit character encoding, designed to cover Russian, which uses the Cyrillic alphabet. It also happens to cover Bulgarian. A related encoding is KOI8-U, for Ukrainian.

KOI8 remains much more commonly used than ISO 8859-5, which never really caught on. Another common Cyrillic character encoding is Windows-1251. Both may eventually give way to Unicode.

In Russian, KOI8 stands for Kod Obmena Informatsiey, 8 bit (Код Обмена Информацией, 8 бит), which means "Code for Information Exchange, 8 bit".

The KOI8 character sets arrange the Russian Cyrillic letters in pseudo-Roman order, that is, roughly in the order of their closest Latin counterparts, rather than in the natural Cyrillic alphabetical order used by ISO 8859-5. Although this may seem unnatural, it has the useful property that if the 8th bit is stripped, the text can still be read (or at least deciphered) in case-reversed transliteration on an ordinary ASCII terminal. For instance, "Русский Текст" in KOI8-R becomes "rUSSKIJ tEKST" ("Russian Text") if the 8th bit is stripped.
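The bit-stripping property described above can be sketched in a few lines of Python. This is only an illustration: the helper name `strip_high_bit` is made up for this example, and it assumes Python's built-in `koi8_r` codec.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear bit 7 of every byte, and read the result as ASCII."""
    raw = text.encode("koi8_r")
    # Clearing the 8th bit maps each Cyrillic letter onto the Latin letter
    # sitting 0x80 below it, with the case reversed.
    stripped = bytes(b & 0x7F for b in raw)
    return stripped.decode("ascii")


print(strip_high_bit("Русский Текст"))  # rUSSKIJ tEKST
```

Plain ASCII characters such as the space are left unchanged, because their 8th bit is already zero.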

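KOI8-R
    x0   x1   x2   x3   x4   x5   x6   x7   x8   x9   xA   xB   xC   xD   xE   xF
0x  unused
1x  unused
2x  SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3x  0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4x  @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5x  P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6x  `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7x  p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~
8x
9x                                                    NBSP      °    ²    ·    ÷
Ax                 ё
Bx                 Ё                                                           ©
Cx  ю    а    б    ц    д    е    ф    г    х    и    й    к    л    м    н    о
Dx  п    я    р    с    т    у    ж    в    ь    ы    з    ш    э    щ    ч    ъ
Ex  Ю    А    Б    Ц    Д    Е    Ф    Г    Х    И    Й    К    Л    М    Н    О
Fx  П    Я    Р    С    Т    У    Ж    В    Ь    Ы    З    Ш    Э    Щ    Ч    Ъ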

In the table above, 20 is the regular SPACE character, and 9A is the NO-BREAK SPACE.
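As a rough cross-check of the table, individual rows can be regenerated from a codec. The sketch below assumes Python's built-in `koi8_r` codec; the helper name `koi8r_row` is hypothetical and exists only for this example.

```python
def koi8r_row(high_nibble: int) -> str:
    """Return the 16 characters of one table row, e.g. 0xC for row Cx."""
    start = high_nibble << 4
    return bytes(range(start, start + 16)).decode("koi8_r")


print(koi8r_row(0xC))  # lowercase Cyrillic row Cx
print(koi8r_row(0xF))  # uppercase Cyrillic row Fx

# The two space characters mentioned above, shown as Unicode code points.
print(f"U+{ord(bytes([0x20]).decode('koi8_r')):04X}")  # SPACE
print(f"U+{ord(bytes([0x9A]).decode('koi8_r')):04X}")  # NO-BREAK SPACE
```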

Although RFC 1489 says that character 95 should be U+2219 (∙), it may also be U+2022 (•) to match the bullet character in Windows-1251.
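One way to handle the two readings of character 95 is to decode with the RFC 1489 mapping and substitute the bullet afterwards when Windows-1251 compatibility is wanted. The following is a minimal sketch, assuming Python's built-in `koi8_r` codec follows RFC 1489 for this byte; the function name `decode_koi8r` and its `windows_bullet` flag are hypothetical, not part of any library.

```python
BULLET_OPERATOR = "\u2219"  # reading of byte 0x95 given by RFC 1489
BULLET = "\u2022"           # bullet character used by Windows-1251


def decode_koi8r(data: bytes, windows_bullet: bool = False) -> str:
    """Decode KOI8-R bytes, optionally treating byte 0x95 as U+2022."""
    text = data.decode("koi8_r")
    if windows_bullet:
        # Assumption: U+2219 can only have come from byte 0x95, so a plain
        # character replacement after decoding is sufficient.
        text = text.replace(BULLET_OPERATOR, BULLET)
    return text


print(decode_koi8r(b"\x95 \xf4\xc5\xcb\xd3\xd4"))
print(decode_koi8r(b"\x95 \xf4\xc5\xcb\xd3\xd4", windows_bullet=True))
```

Remapping after decoding keeps the standard codec untouched, at the cost of not being able to round-trip the substituted bullet back to KOI8-R without reversing the replacement first.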

This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License.