CJK
CJK is a collective term for Chinese, Japanese, and Korean, the most widely used East Asian languages. The term is used in the field of software and communications internationalization. The writing systems of all three languages are based partly or entirely on Chinese characters: Hanzi in Chinese, Kanji in Japanese, and Hanja in Korean.
Chinese requires roughly 4,000 characters for a basic vocabulary and up to 40,000 characters for reasonably complete coverage. Japanese and Korean use fewer characters (complete literacy in Japan can be expected with about 2,000), but idiosyncratic use of Chinese characters in proper names requires many more.
This many characters cannot fit in the 256-character code space of an 8-bit encoding, so CJK text requires either a fixed-width encoding of at least 16 bits or a variable-length multi-byte encoding. Fixed-width 16-bit encodings, such as the one used by early versions of Unicode, are now deprecated, in part because software in China is required to support the GB18030 character set.
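The code-space argument can be checked directly. The following is a minimal Python sketch, not part of the original article; the helper name encoded_lengths and the choice of sample characters are illustrative assumptions. It counts how many bytes three encodings need for a Latin letter, a common Han character, and a Han character that lies beyond the range of a single 16-bit unit.

```python
def encoded_lengths(text, codecs=("utf-8", "utf-16-be", "gb18030")):
    """Return {codec name: number of bytes needed to encode text}."""
    return {codec: len(text.encode(codec)) for codec in codecs}


# Sample characters chosen for illustration only.
samples = {
    "A": "Latin letter",
    "\u6f22": "Han character inside the 16-bit range (U+6F22)",
    "\U00020000": "Han character beyond the 16-bit range (U+20000)",
}

for ch, description in samples.items():
    print(f"U+{ord(ch):04X} {description}: {encoded_lengths(ch)}")
```

The last sample needs four bytes in every encoding shown, including UTF-16, which is why a single fixed-width 16-bit unit per character is not enough.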
At a minimum, a CJK character encoding should cover Han characters plus the language-specific phonetic scripts and notations: pinyin and bopomofo for Chinese, hiragana and katakana for Japanese, and Hangul for Korean.
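As a rough illustration of how these scripts appear as distinct groups of characters in Unicode, the sketch below uses Python's standard unicodedata module. The helper script_of is a hypothetical name, and taking the first word of a character's Unicode name is only a heuristic, not an official script classification.

```python
import unicodedata


def script_of(ch):
    """Return the first word of the character's Unicode name (a heuristic)."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


# One sample per script mentioned in the text (samples are illustrative).
for ch in ["\u6f22", "\u3105", "\u3042", "\u30a2", "\ud55c", "\u01ce"]:
    print(f"U+{ord(ch):04X} {ch} -> {script_of(ch)}")
```

Han characters report CJK, while pinyin letters with tone marks report LATIN, since pinyin is written with the Latin alphabet.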
CJK character encodings include:
- Big5
- Unicode (encoded as UTF-8, UTF-16, or UTF-32)
- GB2312
- GB18030 (the mandated standard in the People's Republic of China)
- Shift-JIS
- ISO-2022-JP
- EUC-JP
- EUC-KR
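The encodings listed above differ both in the bytes they produce and in which characters they can represent at all. The following Python sketch is an illustration only: it assumes the codec names of Python's standard library correspond to the encodings named in the list (with UTF-8 standing in for Unicode), and the helpers try_encode and encode_all are hypothetical names. It encodes one Japanese word with each codec and reports the ones that cannot represent it.

```python
# Python standard-library codec names assumed to match the list above.
CODECS = ["big5", "utf-8", "gb2312", "gb18030",
          "shift_jis", "iso2022_jp", "euc_jp", "euc_kr"]


def try_encode(text, codec):
    """Encode text with codec, or return None if it is not representable."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


def encode_all(text):
    """Return {codec: encoded bytes or None} for every codec in CODECS."""
    return {codec: try_encode(text, codec) for codec in CODECS}


# "Japanese language", written with three Han characters.
for codec, data in encode_all("\u65e5\u672c\u8a9e").items():
    shown = data.hex() if data is not None else "not representable"
    print(f"{codec:<10} {shown}")
```

GB2312 covers simplified Chinese characters only, so it rejects the traditional form of the third character, whereas GB18030 can encode all three.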
The term CJKV means CJK plus Vietnamese, which was written with Chinese characters before it adopted a writing system based solely on the Latin alphabet.
References
- This article was originally based on material from FOLDOC, used with permission
- DeFrancis, John. 1990. The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press. ISBN 0824810686
- Hannas, William C. 1997. Asia's Orthographic Dilemma. Honolulu: University of Hawaii Press. ISBN 082481892X (paperback); ISBN 0824818423 (hardcover)
- Lunde, Ken. 1998. CJKV Information Processing. O'Reilly & Associates. ISBN 1565922247
External links
- http://www.praxagora.com/lunde/cjk_inf.html