Chinese dictionaries date back over two millennia to the Han Dynasty, a significantly longer lexicographical history than that of any other language. There are hundreds of dictionaries for the Chinese language, and this article discusses some of the most important.
The general term císhū (辭書, "lexicographic books") semantically encompasses "dictionary; lexicon; encyclopedia; glossary". The Chinese language has two words for dictionary: zidian (character/logograph dictionary) for written forms, that is, Chinese characters, and cidian (word/phrase dictionary), for spoken forms.
For character dictionaries, zidian (Chinese: 字典; pinyin: zìdiǎn; Wade–Giles: tzŭ⁴-tien³; lit. 'character dictionary') combines zi (字 "character, graph; letter, script, writing; word") and dian (典 "dictionary, encyclopedia; standard, rule; statute, canon; classical allusion").
For word dictionaries, cidian is interchangeably written (辭典/辞典; cídiǎn; tzʻŭ²-tien³; "word dictionary") or (詞典/词典; cídiǎn; tzʻŭ²-tien³; "word dictionary"), using cí (辭; "word, speech; phrase, expression; diction, phraseology; statement; a kind of poetic prose; depart; decline; resign"), and its graphic variant cí (詞; "word, term; expression, phrase; speech, statement; part of speech; a kind of tonal poetry"). Zidian is a much older and more common word than cidian, and Yang notes zidian is often "used for both 'character dictionary' and 'word dictionary'".
The precursors of Chinese dictionaries are primers designed for students of Chinese characters. The earliest of them only survive in fragments or quotations within Chinese classic texts. For example, the Shizhoupian (史籀篇) was compiled by one or more historians in the court of King Xuan of Zhou (r. 827 BCE – 782 BCE), and was the source of the 籀文 zhòuwén variant forms listed in the Han Dynasty Shuowen Jiezi dictionary. The Cangjiepian (倉頡篇 "Chapters of Cang Jie"), named after the legendary inventor of writing, was edited by Li Si, and helped to standardize the Small seal script during the Qin Dynasty.
The collation or lexicographical ordering of a dictionary generally depends upon its writing system. For a language written in an alphabet or syllabary, dictionaries are usually ordered alphabetically. Samuel Johnson defined dictionary as "a book containing the words of any language in alphabetical order, with explanations of their meaning" in his dictionary. But Johnson's definition cannot be applied to Chinese dictionaries, as Chinese is written in characters or logographs, not an alphabet. Johnson did not count the lack of an alphabet to the credit of the Chinese: in 1778, when James Boswell asked about Chinese characters, he replied "Sir, they have not an alphabet. They have not been able to form what all other nations have formed". Nevertheless, the Chinese made their dictionaries, and developed three original systems for lexicographical ordering: semantic categories, graphic components, and pronunciations.
The first system of dictionary organization is by semantic categories. The circa 3rd-century BCE Erya (爾雅 "Approaching Correctness") is the oldest extant Chinese dictionary, and scholarship reveals that it is a pre-Qin compilation of glosses to classical texts. It contains lists of synonyms arranged into 19 semantic categories (e.g., "Explaining Plants", "Explaining Trees"). The Han Dynasty dictionary Xiao Erya (小爾雅 "Little Erya") reduces these 19 to 13 chapters. The early 3rd century CE Guangya (廣雅 "Expanded Erya"), from the state of Cao Wei, followed the Erya's original 19 chapters. The circa 1080 CE Piya (埤雅 "Increased Erya"), from the Song Dynasty, has 8 semantically based chapters of names for plants and animals. For a dictionary user wanting to look up a character, this arbitrary semantic system is inefficient unless one already knows, or can guess, the meaning.
Two other Han Dynasty lexicons are loosely organized by semantics. The 1st century CE Fangyan (方言 "Regional Speech") is the world's oldest known dialectal dictionary. The circa 200 CE Shiming (釋名 "Explaining Names") employs paronomastic glosses to define words.
The second system of dictionary organization is by recurring graphic components or radicals. The famous 100–121 CE Shuowen Jiezi (說文解字 "Explaining Simple and Analyzing Compound Characters") arranged characters through a system of 540 bushou (部首 "section header") radicals. The 543 CE Yupian (玉篇 "Jade Chapters"), from the Liang Dynasty, rearranged them into 542. The 1615 CE Zihui (字彙 "Character Glossary"), edited by Mei Yingzuo (梅膺祚) during the Ming Dynasty, simplified the 540 Shuowen Jiezi radicals to 214. It also originated the "radical-stroke" scheme of ordering characters by the number of residual graphic strokes besides the radical. The 1627 CE Zhengzitong (正字通 "Correct Character Mastery") also used 214. The 1716 CE Kangxi Zidian (康熙字典 "Kangxi Dictionary"), compiled under the Kangxi Emperor of the Qing Dynasty, became the standard dictionary for Chinese characters, and popularized the system of 214 radicals. As most Chinese characters are semantic-phonetic ones (形聲字), the radical method is usually effective, and it continues to be widely used in the present day. However, sometimes the radical of a character is not obvious. To compensate for this, a "Chart of Characters that Are Difficult to Look up" (難檢字表), arranged by the number of strokes of the characters, is usually provided.
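The radical-stroke scheme amounts to a two-level sort key: radical first, then the number of strokes left over once the radical is removed. The following is a minimal sketch of that idea, not the procedure of any particular dictionary. The table is hand-entered for illustration, using radical numbers from the 214-radical system; the names `ENTRIES`, `radical_stroke_key`, and `collate` are hypothetical.

```python
# Hand-entered illustrative data (an assumption, not taken from any dictionary file):
# (character, radical number in the 214-radical system, residual stroke count)
ENTRIES = [
    ("海", 85, 7),  # radical 水 "water" plus 7 residual strokes
    ("林", 75, 4),  # radical 木 "tree" plus 4 residual strokes
    ("好", 38, 3),  # radical 女 "woman" plus 3 residual strokes
    ("河", 85, 5),  # radical 水 "water" plus 5 residual strokes
    ("村", 75, 3),  # radical 木 "tree" plus 3 residual strokes
    ("明", 72, 4),  # radical 日 "sun" plus 4 residual strokes
]


def radical_stroke_key(entry):
    """Sort key: radical number, then residual strokes, then code point as a tie-break."""
    char, radical, residual = entry
    return (radical, residual, ord(char))


def collate(entries):
    """Return the characters in radical-stroke order."""
    return [char for char, _, _ in sorted(entries, key=radical_stroke_key)]


if __name__ == "__main__":
    print(collate(ENTRIES))  # ['好', '明', '村', '林', '河', '海']
```

The code-point tie-break is an arbitrary choice made here so the order is deterministic; printed dictionaries settle ties among characters with the same radical and residual stroke count by their own conventions.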
The third system of lexicographical ordering is by character pronunciation. This type of dictionary collates its entries by syllable rime and tones, and comprises the so-called "rime dictionary". The first surviving rime dictionary is the 601 CE Qieyun (切韻 "Cutting [Spelling] Rimes") from the Sui Dynasty; it became the standard of pronunciation for Middle Chinese. During the Song Dynasty, it was expanded into the 1011 CE Guangyun (廣韻 "Expanded Rimes") and the 1037 CE Jiyun (集韻 "Collected Rimes").
The clear problem with these old phonetically arranged dictionaries is that the would-be user needs to know a character's rime before looking it up. Thus, dictionaries collated this way can only serve the literati.
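Collation by tone and rime can be sketched as a grouping step: entries are ordered by tone, then by the position of their rime within that tone, and collected under a (tone, rime) heading. The sketch below is a simplification under stated assumptions: the six entries, their rime labels, and the rime positions are hand-entered for illustration, and real rime dictionaries further subdivide each rime into groups of homophones, which is omitted here. All names are hypothetical.

```python
# The four traditional tone categories, in their conventional order.
TONE_ORDER = {"平": 0, "上": 1, "去": 2, "入": 3}

# Hand-entered illustrative data (an assumption):
# (character, tone category, rime label, position of the rime within its tone)
ENTRIES = [
    ("屋", "入", "屋", 1),
    ("冬", "平", "冬", 2),
    ("送", "去", "送", 1),
    ("同", "平", "東", 1),
    ("董", "上", "董", 1),
    ("東", "平", "東", 1),
]


def group_by_tone_and_rime(entries):
    """Order entries by tone, then rime position, and group them under (tone, rime)."""
    ordered = sorted(entries, key=lambda e: (TONE_ORDER[e[1]], e[3]))
    groups = {}
    for char, tone, rime, _ in ordered:
        groups.setdefault((tone, rime), []).append(char)
    return groups


if __name__ == "__main__":
    for (tone, rime), chars in group_by_tone_and_rime(ENTRIES).items():
        print(tone, rime, chars)
```

The sketch also makes the usability problem concrete: to find 同, a reader must already know that it belongs to the level tone and to the 東 rime.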
A great number of modern dictionaries published today arrange their entries by pinyin or other methods of romanisation, together with a radicals index. Some of these pinyin dictionaries also contain indices of the characters arranged by number and order of strokes, by the four corner encoding (四角碼) or by the cangjie encoding (倉頡碼).
Some dictionaries employ more than one of these three methods of collation. For example, the Longkan Shoujian (龍龕手鑑) of the Liao Dynasty uses radicals, which are grouped by tone. The characters under each radical are also grouped by tone.
Besides being categorized by their methods of collation, ancient Chinese dictionaries can also be classified by their functions. In the traditional bibliographic divisions of the imperial collection Siku Quanshu, dictionaries were classified as belonging to xiǎoxué (小學, lit. "minor learning", the premodern equivalent of "linguistics"), which was contrasted with dàxué (大學, "major learning", i.e., learning that had moral implications). Xiaoxue was divided into texts dealing with xùngǔ (訓詁, "exegesis" similar to "philology"), wénzì (文字, "script", analogous to "grammatology"), and yīnyùn (音韻, "sounds and rhymes," comparable to "phonology").
The Xungu type, sometimes called yǎshū (雅書, "word book"), comprises Erya and its descendants. These exegetical dictionaries focus on explaining meanings of words as found in the Chinese classics.
The Wenzi dictionaries, called zìshū (字書 "character book"), comprise Shuowen Jiezi, Yupian, Zihui, Zhengzitong, and Kangxi Zidian. This type of dictionary, which focuses on the shape and structure of the characters, subsumes both "orthography dictionaries", such as the Ganlu Zishu (干祿字書) of the Tang Dynasty, and "script dictionaries", such as the Liyun (隸韻) of the Song Dynasty. Although these dictionaries center upon the graphic properties of Chinese characters, they do not necessarily collate characters by radical. For instance, Liyun is a clerical script dictionary collated by tone and rime.
The Yinyun type, called yùnshū (韻書 "rime book"), focuses on the pronunciations of characters. These dictionaries are always collated by rimes.
While the above traditional pre-20th-century Chinese dictionaries focused upon the meanings and pronunciations of words in classical texts, they practically ignored the spoken language and vernacular literature.
The Kangxi Zidian served as the standard Chinese dictionary for generations, is still published and is now online. Contemporary lexicography is divisible between bilingual and monolingual Chinese dictionaries.
The foreigners who entered China in late Ming and Qing Dynasties needed dictionaries for different purposes than native speakers. Wanting to learn Chinese, they compiled the first grammar books and bilingual dictionaries. Westerners adapted the Latin alphabet to represent Chinese pronunciation, and arranged their dictionaries accordingly.
Two Bible translators edited early Chinese dictionaries. The Scottish missionary Robert Morrison wrote A Dictionary of the Chinese Language (1815–1823). The British missionary Walter Henry Medhurst wrote a Hokkien (Min Nan) dialect dictionary in 1832 and the Chinese and English Dictionary in 1842. Both were flawed in their representation of pronunciations, such as aspirated stops. In 1874 the American philologist and diplomat Samuel Wells Williams applied the method of dialect comparison in his dictionary, A Syllabic Dictionary of the Chinese Language, which refined distinctions in articulation and gave variant regional pronunciations in addition to standard Peking pronunciation.
The British consular officer and linguist Herbert Giles criticized Williams as "the lexicographer not for the future but of the past", and took nearly twenty years to compile his A Chinese-English Dictionary (1892, 1912), one that Norman calls "the first truly adequate Chinese–English dictionary". It contained 13,848 characters and numerous compound expressions, with pronunciation based upon Beijing Mandarin, which it compared with nine southern dialects such as Cantonese, Hakka, and Fuzhou dialect. It has been called "still interesting as a repository of late Qing documentary Chinese, although there is little or no indication of the citations, mainly from the Kangxi zidian." Giles modified the Chinese romanization system of Thomas Francis Wade to create the Wade-Giles system, which was standard in English-speaking countries until 1979, when pinyin was adopted. The Giles dictionary was replaced by the 1931 dictionary of the Australian missionary Robert Henry Mathews. Mathews' Chinese-English Dictionary, which was popular for decades, was based on Giles and partially updated by Y.R. Chao in 1943 and reprinted in 1960.
Trained in American structural linguistics, Yuen Ren Chao and Lien-sheng Yang wrote a Concise Dictionary of Spoken Chinese (1947) that emphasized the spoken rather than the written language. Main entries were listed in Gwoyeu Romatzyh, and they distinguished free morphemes from bound morphemes. A hint of non-standard pronunciation was also given, by marking final stops and initial voicing and non-palatalization in non-Mandarin dialects.
The Swedish sinologist Bernhard Karlgren wrote the seminal (1957) Grammata Serica Recensa with his reconstructed pronunciations for Middle Chinese and Old Chinese.
Chinese lexicography advanced during the 1970s. The translator Lin Yutang wrote the semantically sophisticated Lin Yutang's Chinese-English Dictionary of Modern Usage (1972) that is now available online. The author Liang Shih-Chiu edited two full-scale dictionaries: Chinese-English with over 8,000 characters and 100,000 entries, and English-Chinese with over 160,000 entries.
The linguist and professor of Chinese John DeFrancis edited a groundbreaking Chinese–English dictionary (1996) giving more than 196,000 words or terms alphabetically arranged in a single-tier pinyin order. The user can therefore find a term whose pronunciation is known directly, rather than searching by radical or character structure, the latter being a two-tiered approach. This project had long been advocated by another pinyin proponent, Victor H. Mair.
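The difference between a single-tier and a two-tier arrangement can be illustrated with sort keys. In the sketch below, the single-tier key alphabetizes the whole romanized word as one string, while the two-tier key first files each word under its head character and only then orders words within that group. This is a simplified illustration under stated assumptions: tones are written as trailing digits, ties between identical spellings are broken by tone, and the function names and word list are hypothetical. It does not reproduce the exact ordering rules of any published dictionary.

```python
# Hand-picked illustrative words with numbered-tone pinyin (an assumption).
WORDS = [
    ("大学", ["da4", "xue2"]),
    ("单位", ["dan1", "wei4"]),
    ("大家", ["da4", "jia1"]),
    ("单词", ["dan1", "ci2"]),
    ("答案", ["da2", "an4"]),
]


def split_tone(syllable):
    """Split 'xue2' into ('xue', 2)."""
    return syllable[:-1], int(syllable[-1])


def single_tier_key(entry):
    """Alphabetize the whole word as one string; tones only break ties."""
    _, syllables = entry
    letters = "".join(split_tone(s)[0] for s in syllables)
    tones = tuple(split_tone(s)[1] for s in syllables)
    return (letters, tones)


def two_tier_key(entry):
    """File the word under its head character first, then order within that group."""
    word, syllables = entry
    head_letters, head_tone = split_tone(syllables[0])
    return (head_letters, head_tone, word[0], single_tier_key(entry))


def ordered(words, key):
    return [word for word, _ in sorted(words, key=key)]


if __name__ == "__main__":
    print(ordered(WORDS, single_tier_key))  # ['答案', '大家', '单词', '单位', '大学']
    print(ordered(WORDS, two_tier_key))     # ['答案', '大家', '大学', '单词', '单位']
```

In the two-tier order 大学 (daxue) stays with the other words headed by 大, ahead of every word headed by 单 (dan). In the single-tier order it falls after 单位 (danwei), because the full strings are compared letter by letter.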
When the Republic of China began in 1912, educators and scholars recognized the need to update the 1716 Kangxi Zidian. It was thoroughly revised in the (1915) Zhonghua Da Zidian (中華大字典 "Comprehensive Chinese-Character Dictionary"), which corrected over 4,000 Kangxi Zidian mistakes and added more than 1,000 new characters. Lu Erkui's (1915) Ciyuan (辭源 "Sources of Words") was a groundbreaking effort in Chinese lexicography and can be considered the first cidian "word dictionary".
Shu Xincheng's (1936) Cihai (辭海 "Sea of Words") was a comprehensive dictionary of characters and expressions, and provided near-encyclopedic coverage in fields like science, philosophy, history. The Cihai remains a popular dictionary and has been frequently revised.
The (1937) Guoyu cidian (國語辭典 "Dictionary of the National Language") was a four-volume dictionary of words, designed to standardize modern pronunciation. The main entries were characters listed phonologically by Zhuyin Fuhao and Gwoyeu Romatzyh. For example, the title in these systems is ㄍㄨㄛㄩ ㄘㄉ一ㄢ and Gwoyeu tsyrdean.
Wei Jiangong's (1953) Xinhua Zidian (新华字典 "New China Character Dictionary") is a pocket-sized reference, alphabetically arranged by pinyin. It is the world's most popular reference work. The 11th edition was published in 2011.
Lü Shuxiang's (1973) Xiandai Hanyu Cidian (现代汉语词典 "Contemporary Chinese Dictionary") is a middle-sized dictionary of words. It is arranged by characters, alphabetized by pinyin, under which compounds and phrases are listed, with a total of 56,000 entries (expanded to 70,000 in the 2016 edition). Both the Xinhua zidian and the Xiandai Hanyu cidian followed a simplified scheme of 189 radicals.
Two outstanding achievements in contemporary Chinese lexicography are the (1986–93) Hanyu Da Cidian (漢語大詞典 "Comprehensive Dictionary of Chinese Words") with over 370,000 word and phrase entries listed under 23,000 different characters; and the (1986–89) Hanyu Da Zidian (漢語大字典 "Comprehensive Dictionary of Chinese Characters") with 54,678 head entries for characters. They both use a system of 200 radicals.
In recent years, the computerization of Chinese has allowed lexicographers to create dianzi cidian (電子詞典/电子词典 "electronic dictionaries") usable on computers, PDAs, etc. There are proprietary systems, such as Wenlin Software for learning Chinese, and there are also free dictionaries available online. Since Paul Denisowski started the volunteer CEDICT (Chinese–English dictionary) project in 1997, it has grown into a standard reference database. The CEDICT is the basis for many Internet dictionaries of Chinese, and is included in the Unihan Database.
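To show how such a database can be consumed by software, here is a minimal sketch of a parser for one CEDICT-style line. It assumes the commonly described plain-text layout of traditional form, simplified form, bracketed numbered-tone pinyin, and slash-delimited glosses, with comment lines starting with `#`. The sample line is written for illustration and is not quoted from the actual data file; `LINE_RE` and `parse_line` are hypothetical names.

```python
import re

# Assumed layout: TRADITIONAL SIMPLIFIED [pin1 yin1] /gloss 1/gloss 2/
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_line(line):
    """Parse one CEDICT-style line into a dict, or return None for comments and junk."""
    if line.startswith("#"):
        return None  # comment or header line
    match = LINE_RE.match(line)
    if match is None:
        return None  # blank or malformed line
    traditional, simplified, pinyin, glosses = match.groups()
    return {
        "traditional": traditional,
        "simplified": simplified,
        "pinyin": pinyin.split(),
        "glosses": glosses.split("/"),
    }


# Illustrative sample line (an assumption, not copied from the real file).
SAMPLE = "詞典 词典 [ci2 dian3] /dictionary/"

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Returning `None` for anything that does not match keeps the sketch tolerant of headers and blank lines; a production reader would more likely report malformed lines than silently skip them.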
Chinese publishing houses print diverse types of zhuanke cidian (專科詞典/专科词典 "specialized dictionary"). One Chinese dictionary bibliography lists over 130 subject categories, from "Abbreviations, Accounting" to "Veterinary, Zoology." The following examples are limited to specialized dictionaries from a few representative fields.
Dictionaries of Ancient Chinese give definitions, in Modern Chinese, of characters and words found in the pre-Modern (before 1911) Chinese literature. They are typically organized by pinyin or by Zihui radicals, and give definitions in order of antiquity (most ancient to most recent) when several definitions exist. Quotes from the literature exemplifying each listed meaning are given. Quotes are usually chosen from the pre-Han Classical literature when possible, unless the definition emerged during the post-Classical period. Dictionaries intended for historians, linguists, and other classical scholars will sometimes also provide Middle Chinese fanqie readings and/or Old Chinese rime groups, as well as bronze script or oracle bone script forms.
While dictionaries published in Mainland China intended for study or reference by high school/college students are generally printed in Simplified Chinese, dictionaries intended for scholarly research are set in Traditional Chinese.
Twenty centuries ago, the Fangyan was the first Chinese specialized dictionary. The usual English translation for fangyan (方言 lit. "regional/areal speech") is "dialect", but the language situation in China is said to be uniquely complex. In the "dialect" sense of English dialects, Chinese has Mandarin dialects, yet fangyan also means "non-Mandarin languages, mutually unintelligible regional varieties of Chinese", such as Cantonese and Hakka. Some linguists like John DeFrancis prefer the translation "topolect" for these varieties, which are very similar to independent languages. (See also: Protection of the Varieties of Chinese.) The Dictionary of Frequently-Used Taiwan Minnan is an online dictionary of Taiwanese Hokkien. Here are some general fangyan cidian (方言词典 "topolect dictionary") examples.
Chinese has five words translatable as "idiom": chengyu (成語/成语 "set phrase; idiom"), yanyu (諺語/谚语 "proverb; popular saying, maxim; idiom"), xiehouyu (歇後語/歇后语 "truncated witticism, aposiopesis; enigmatic folk simile"), xiyu (習語/习语 "idiom"), and guanyongyu (慣用語/惯用语 "fixed expression; idiom; locution"). Some modern dictionaries for idioms are:
The Chinese language adopted a few foreign wailaici (外來詞/外来词 "loanwords") during the Han Dynasty, especially after Zhang Qian's exploration of the Western Regions. The lexicon absorbed many Buddhist terms and concepts when Chinese Buddhism began to flourish in the Southern and Northern Dynasties. During the late 19th century, when Western powers forced open China's doors, numerous loanwords entered Chinese, many through the Japanese language. While some foreign borrowings became obsolete, others became indispensable terms in modern vocabulary.
The 20th century saw rapid progress in the study of the lexicons found in Chinese vernacular literature, which includes novels, dramas and poetry. Important works in the field include:
Employing corpus linguistics and lists of Chinese characters arranged by frequency of usage (e.g., List of Commonly Used Characters in Modern Chinese), lexicographers have compiled dictionaries for learners of Chinese as a foreign language. These specialized Chinese dictionaries are available either as add-ons to existing publications like Yuan's 2004 Pocket Dictionary and Wenlin or as specific ones like
Victor H. Mair lists eight adverse features of traditional Chinese lexicography, some of which have continued up to the present day: (1) persistent confusion of spoken word with written graph; (2) lack of etymological science as opposed to the analysis of script; (3) absence of the concept of word; (4) ignoring the script's historical developments in the oracle bones and bronze inscriptions; (5) no precise, unambiguous, and convenient means for specifying pronunciations; (6) no standardized, user-friendly means for looking up words and graphs; (7) failure to distinguish linguistically between vernacular and literary registers, or between usages peculiar to different regions and times; and (8) open-endedness of the writing system, with current unabridged character dictionaries containing 60,000 to 85,000 graphs.