GB 12345,[1] entitled Code of Chinese ideogram set for information interchange supplementary set (Chinese: 信息交換用漢字編碼字符集 輔助集), is a Traditional Chinese character set standard established by China, and can be thought of as the traditional counterpart of GB 2312. It is used as an encoding of traditional Chinese characters, although it is less commonly used than Big5. It contains 6,866 characters and is neither related to nor compatible with Big5 or CNS 11643.

Characters

Characters in GB 12345 are arranged in a 94×94 grid (as in ISO/IEC 2022), and the two-byte code point of each character is expressed in the qu-wei form, which specifies a row (qu 区) and the position of the character within the row (cell, wei 位).
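As an illustration of the qu-wei form, the following minimal sketch converts a row/cell pair to a two-byte code and back. It assumes the usual ISO/IEC 2022 convention for 94×94 sets (each number offset by 0x20 for the 7-bit form) and an EUC-CN-style 8-bit form (offset by 0xA0); the function names are hypothetical and are not taken from the standard.

```python
def quwei_to_bytes(qu: int, wei: int, eight_bit: bool = True) -> bytes:
    """Convert a qu-wei (row, cell) pair to a two-byte code.

    Assumption: 7-bit form adds 0x20 to each number (ISO/IEC 2022 style),
    8-bit form adds 0xA0 (EUC-CN style).
    """
    if not (1 <= qu <= 94 and 1 <= wei <= 94):
        raise ValueError("qu and wei must both be in the range 1..94")
    offset = 0xA0 if eight_bit else 0x20
    return bytes([qu + offset, wei + offset])


def bytes_to_quwei(pair: bytes) -> tuple[int, int]:
    """Recover the (qu, wei) pair from either the 7-bit or the 8-bit form."""
    if len(pair) != 2:
        raise ValueError("expected exactly two bytes")
    # Clearing the high bit maps the 8-bit form onto the 7-bit form.
    qu, wei = ((b & 0x7F) - 0x20 for b in pair)
    if not (1 <= qu <= 94 and 1 <= wei <= 94):
        raise ValueError("bytes do not encode a valid qu-wei pair")
    return qu, wei
```

For example, row 16, cell 1 becomes the bytes 0xB0 0xA1 in the 8-bit form and 0x30 0x21 in the 7-bit form under these assumptions.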

The rows (numbered from 1 to 94) contain characters as follows:[2][3]

Rows 10–15 and 90–94 are unassigned.

Encodings

The specification for the ISO-2022-CN-EXT encoding states that the sequence ESC $ ) followed by a yet-undetermined final byte (shown by the placeholder <X12345>) can be used to indicate GB 12345 characters, just as the sequence ESC $ ) A indicates GB 2312, but only once the set has received a registration in the ISO-IR registry that fixes the final byte of the sequence.[4] As of 2023, no such registration exists.[5] However, the same Request for Comments also defines the encoding label CN-GB-12345 for GB 12345 used with ASCII in a manner analogous to EUC-CN.[4]
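The escape sequences described above can be sketched as byte strings. This is only an illustration: the table and function names are hypothetical, and the entry for GB 12345 is deliberately left empty because its final byte has not been assigned.

```python
ESC = b"\x1b"

# Final bytes for the "ESC $ ) F" designation sequence.
# GB 2312 uses "A"; GB 12345 has only the placeholder <X12345> in RFC 1922,
# so no byte is recorded for it here (assumption: None marks "unregistered").
FINAL_BYTES = {
    "GB 2312": b"A",
    "GB 12345": None,
}


def designation_sequence(charset: str) -> bytes:
    """Return the ESC $ ) F sequence for a charset with a registered final byte."""
    final = FINAL_BYTES[charset]
    if final is None:
        raise ValueError(f"{charset} has no registered final byte")
    return ESC + b"$)" + final
```

With this table, `designation_sequence("GB 2312")` yields the bytes 0x1B 0x24 0x29 0x41, while `designation_sequence("GB 12345")` raises an error, reflecting the missing registration.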

Inclusion of non-standard Traditional Chinese characters

GB/T 12345 includes a few traditional characters that differ from those given in the table of correspondences between Simplified Chinese characters and Traditional Chinese characters in the standard Table of General Standard Chinese Characters.

GB 12345 and Unicode

The characters in GB 12345 were taken as one of the sources for the Han unification, which led to the unified set of CJK characters in the initial ISO 10646/Unicode standard. All 6,866 Chinese characters were incorporated.

References

  1. ^ "GB/T 12345-1990: Code of Chinese ideogram set for information interchange--Supplementary set". Standardization Administration of the People's Republic of China. Retrieved 2022-10-01.
  2. ^ Lunde, Ken (2009). CJKV Information Processing: Chinese, Japanese, Korean & Vietnamese Computing (2nd ed.). Sebastopol, CA: O'Reilly. pp. 150–151. ISBN 978-0-596-51447-1.
  3. ^ Chung, Jaemin (2014-12-20). "GB 12052-89 to Unicode table".
  4. ^ a b Zhu, HF.; Hu, DY.; Wang, ZG.; Kao, TC.; Chang, WCH.; Crispin, M. (1996). Chinese Character Encoding for Internet Messages. IETF. doi:10.17487/RFC1922. RFC 1922. Note: Currently, there are some GB sets that have not been registered in ISO. Here <X7589>, <X7590>, <X12345>, <X13131> and <X13132> represent the final character that will be assigned by ISO for those sets. These GB sets shall only be used once these final characters are assigned.
  5. ^ ISO-IR: ISO/IEC International Register of Coded Character Sets To Be Used With Escape Sequences (PDF) (Registry Index). ITSCJ/IPSJ.