Thursday, November 29, 2012

Joseph D. Becker

Unicode is intended to address the need for a workable, reliable world text encoding. Unicode could be roughly described as "wide-body ASCII" that has been stretched to 16 bits to encompass the characters of all the world's living languages. In a properly engineered design, 16 bits per character are more than sufficient for this purpose.
The idea of expanding the basis for character encoding from 8 to 16 bits is so sensible, indeed so obvious, that the mind initially recoils from it.
The major catch is simply that the 16-bit approach requires перестройка (perestroika), i.e. restructuring our old ways of thinking. Rather than struggling to salvage obsolete 8-bit encodings via horrendous "extension" contrivances, we need to recognize that the current absence of a standard international/multilingual encoding is a unique opportunity to rethink and revitalize the design concepts behind text encoding.

1 comment:

  1. Unicode 88

    by Joseph D. Becker

    1988

    http://www.unicode.org/history/unicode88.pdf
