ABSTRACT

Encoding the script of a language for use in computer systems must address at least three aspects. First, every symbol used in the language must be defined and recognized by these systems. Second, text must be displayed according to the writing conventions of that language. Third, the encoding must take into account the sequencing of symbols so that text can be sorted and indexed easily during text processing. In this chapter, we first introduce the general principles of encoding scripts in computer systems, including how to analyze a script and how to assign code points to the symbols it uses. We then give a more detailed analysis of the Chinese writing system and how it is represented in computer systems. Finally, we introduce some new methods for alternative representations of Chinese characters in computer systems.