Geographic distribution: Greater China, Singapore, Christmas Island, Han diaspora communities, Ferghana Valley (Uzbekistan) and Chu Valley (Kyrgyzstan) (Dungans), Russian Far East (Tazs)
Map of non-Bai Sinitic languages in China
The Sinitic languages[a] (漢語族/汉语族), often synonymous with "Chinese languages", are a group of East Asian analytic languages that constitute the major branch of the Sino-Tibetan language family. It is frequently proposed that there is a primary split between the Sinitic languages and the rest of the family (the Tibeto-Burman languages). This view is rejected by a number of researchers but has found phylogenetic support among others. The Greater Bai languages, whose classification is difficult, may be an offshoot of Old Chinese and thus Sinitic; otherwise Sinitic is defined only by the many varieties of Chinese unified by a shared historical background, and usage of the term "Sinitic" may reflect the linguistic view that Chinese constitutes a family of distinct languages, rather than variants of a single language.[b]
Over 91% of the Chinese population speaks a Sinitic language. The total number of speakers of the Chinese macrolanguage is 1,521,943,700, of whom about 73.5% (1,118,584,040) speak a Mandarin variety. The estimated numbers of speakers globally, both native and secondary, of the larger branches of the Sinitic languages are listed below (2018–19):
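The quoted Mandarin share can be checked directly from the two speaker counts. The following is a minimal Python sketch; the variable and function names are illustrative only and do not come from any source dataset.

```python
# Speaker counts as quoted in the paragraph above.
TOTAL_SPEAKERS = 1_521_943_700
MANDARIN_SPEAKERS = 1_118_584_040


def mandarin_share(total: int = TOTAL_SPEAKERS,
                   mandarin: int = MANDARIN_SPEAKERS) -> float:
    """Return the fraction of Chinese-macrolanguage speakers who use Mandarin."""
    return mandarin / total


# Speakers of all non-Mandarin varieties, by simple subtraction.
non_mandarin_speakers = TOTAL_SPEAKERS - MANDARIN_SPEAKERS

print(f"Mandarin share: {mandarin_share():.1%}")
print(f"Non-Mandarin speakers: {non_mandarin_speakers:,}")
```

The share rounds to the 73.5% given in the text, which leaves roughly 403 million speakers for all other branches combined.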
Dialectologist Jerry Norman estimated that there are hundreds of mutually unintelligible Sinitic languages. They form a dialect continuum in which differences generally become more pronounced as distances increase, though there are also some sharp boundaries. The Sinitic languages can be divided into Macro-Bai languages and Chinese languages, and the following is one of many potential ways of subdividing these languages. Some varieties, such as Shaozhou Tuhua, are hard to classify, and thus are not included in the following briefs.
Macro-Bai is a language family first proposed by linguist Zhengzhang Shangfang, and was later expanded to include Longjia and Luren. It likely split off from the rest of Sinitic during the Old Chinese period. The languages included are all considered minority languages in China and are spoken in the Southwest. The languages are:
All other Sinitic languages are henceforth considered Chinese.
The Chinese branch of the family is classified into at least 7 main families. These families are classified based on five main evolutionary criteria:
The varieties within one family may not be mutually intelligible with each other. For instance, Wenzhounese and Ningbonese are not highly mutually intelligible. The Language Atlas of China identifies ten groups:
Of these, Jin, Hui, and Pinghua and Tuhua are not considered part of the seven traditional groups.
Mandarinic languages are used in the Western Regions, the Southwest, Huguang, Inner Mongolia, the Central Plains and the Northeast, by around three-quarters of the Sinitic-speaking population. Historically, the prestige variety has always been Mandarin, which is still reflected to this day in Standard Chinese. In fact, Standard Chinese is now an official language of the Republic of China, the People's Republic of China, Singapore and the United Nations. Repopulation efforts, such as that of the Qing Dynasty in the Southwest, tended to involve Mandarinic speakers. Classification of Mandarinic lects has undergone several significant changes, though nowadays it is commonly divided as follows, based on the distribution of the historical checked tone:
as well as other lects, which do not neatly fall into these categories, such as Mandarinic Junhua varieties.
Mandarinic varieties can be defined by their universally lost -m final, low number of tones, and smaller inventory of classifiers, among other features. Mandarinic languages also often have rhotic erhua rimes, though the amount of its use may vary between lects. Loss of checked tone is an often cited criterion for Mandarinic languages, though lects such as Yangzhounese and Taiyuannese show otherwise.
Northeastern Mandarin is spoken in Heilongjiang, Jilin, most of Liaoning and northeastern Inner Mongolia, whereas Beijing Mandarin is spoken in northern Hebei, most of Beijing, parts of Tianjin and Inner Mongolia. The two families' most notable features are the heavy use of rhotic erhua and seemingly random distribution of the dark checked tone, and generally having four tones with the contours of high flat, rising, dipping, and falling.
Northeastern Mandarin, especially in Heilongjiang, contains many loanwords from Russian.
Northeastern Mandarin lects can be divided into three main groups, namely Hafu (including Harbinnese and Changchunnese), Jishen (including Jilinnese and Shenyangnese), and Heisong. Notably, the extinct Taz language of Russia is also a Northeastern Mandarin language. Beijing is sometimes included in Northeastern Mandarin due to its distribution of the historical dark checked tone, though it is listed as its own group by others, often due to its more regular light checked tones.
Jilu Mandarin is spoken in southern Hebei and western Shandong, and is often represented by Jinannese. Notable cities that use Jilu Mandarin lects include Cangzhou, Shijiazhuang, Jinan and Baoding. Characteristic Jilu Mandarin features include merging the dark checked tone into the dark level tone, merging the light checked tone into the light level or departing tone depending on the manner of articulation of the initial, and vowel breaking in checked-tone words of the tong rime series (通攝), among other features.
Jilu Mandarin can be classified into Baotang, Shiji, Canghui and Zhangli. Zhangli is of note due to its preservation of a separate checked tone.
Jiaoliao Mandarin is spoken in the Jiaodong and Liaodong Peninsulas, which include the cities of Dalian and Qingdao, as well as several prefectures along the China-Korea border. Like Jilu Mandarin, its light checked tone is merged into light level or departing based on the manner of articulation of the initial, though its dark checked tone is merged into the rising tone. Its ri initial (日母) terms are pronounced with a null initial (apart from open zhi rime series (止攝開口) finals), unlike the /ʐ/ of Northern and Beijing Mandarin.
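The checked-tone mergers described for Jilu and Jiaoliao Mandarin can be laid out as a small lookup table, which makes it easy to see where the two groups differ. This is only an illustrative Python sketch: the names `CHECKED_TONE_REFLEXES`, `checked_reflexes` and `differing_registers` are hypothetical, and the table records only the outcomes stated above, without modelling which initials select light level versus departing.

```python
# Modern tone categories that the historical checked tone merges into,
# keyed by Mandarin group and by register of the historical checked tone.
# Only the outcomes stated in the text are recorded.
CHECKED_TONE_REFLEXES = {
    "Jilu": {
        "dark": ("dark level",),
        "light": ("light level", "departing"),  # depends on the initial
    },
    "Jiaoliao": {
        "dark": ("rising",),
        "light": ("light level", "departing"),  # depends on the initial
    },
}


def checked_reflexes(group: str, register: str) -> tuple:
    """Return the possible modern tones for a historical checked tone."""
    return CHECKED_TONE_REFLEXES[group][register]


def differing_registers(group_a: str, group_b: str) -> list:
    """List the checked-tone registers whose outcomes differ between two groups."""
    a = CHECKED_TONE_REFLEXES[group_a]
    b = CHECKED_TONE_REFLEXES[group_b]
    return sorted(reg for reg in a if a[reg] != b[reg])


print(checked_reflexes("Jiaoliao", "dark"))
print(differing_registers("Jilu", "Jiaoliao"))
```

Under these assumptions the two groups differ only in the treatment of the dark checked tone, which is the diagnostic used in the text.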
Based on, for example, the pronunciation of the palatalized jian initial (見母), Jiaoliao Mandarin can be divided into Qingzhou, Denglian and Gaihuan areas.
|交||ciau||ciau||tɕiɔ||tɕiɔ||to hand in|
Central Plains Mandarin is spoken in the Central Plains of Henan, southwestern Shanxi, southern Shandong and northern Jiangsu, as well as most of Shaanxi, southern Ningxia and Gansu and southern Xinjiang, in well-known cities such as Kaifeng, Zhengzhou, Luoyang, Xuzhou, Xi'an, Xining and Lanzhou. Central Plains Mandarin lects merge the historical checked tones with a lesser muddy (次濁) or clear (清) initial into the dark level tone, and those with a fully muddy (全濁) initial into the light level tone.
Lanyin Mandarin, spoken in northern Ningxia, parts of Gansu and northern Xinjiang, is sometimes grouped together with Central Plains Mandarin due to its merged lesser light and dark checked tones, though it is realised as a departing tone.
Subdivision of Central Plains Mandarin is not fully agreed upon, though one possible subdivision sees 13 divisions, namely Xuhuai, Zhengkai, Luosong, Nanlu, Yanhe, Shangfu, Xinbeng, Luoxiang, Fenhe, Guanzhong, Qinlong, Longzhong and Nanjiang. Lanyin Mandarin, on the other hand, is divided as Jincheng, Yinwu, Hexi and Beijiang. The Dungan language is a collection of Central Plains Mandarin varieties spoken in the former Soviet Union.
Jin is spoken in most of Shanxi, western Hebei, northern Shaanxi, northern Henan and central Inner Mongolia, and is often represented by Taiyuannese. It was first proposed as a group separate from the rest of Mandarin by Li Rong, who defined it as the lects in and around Shanxi that have a checked tone, though this stance is not without disagreement. Jin varieties also often have disyllabic words derived from syllable splitting (分音詞), through the infixation of /(u)əʔ l/.
As per the Language Atlas by Li, Jin is divided into Dabao, Zhanghu, Wutai, Lüliang, Bingzhou, Shangdang, Hanxin, and Zhiyan branches.
Spoken in Yunnan, Guizhou, northern Guangxi, most of Sichuan, southern Gansu and Shaanxi, Chongqing, most of Hubei and bordering parts of Hunan, as well as Kokang of Myanmar and parts of northern Thailand, Southwestern Mandarin speakers take up the most area and population of all Mandarinic language groups, and would be the eighth most spoken language in the world if separated from the rest of Mandarin. Southwestern Mandarinic tends to not have retroflex consonants, and merges all checked tone categories together. With the exception of Minchi, which has a standalone checked category, the checked tone is merged with another category. Representative lects include Wuhannese and Sichuanese, and sometimes Kunmingnese.
Southwestern Mandarin tends to be split as Chuanqian, Xishu, Chuanxi, Yunnan, Huguang and Guiliu branches. Minchi is sometimes separated out as a remnant of Old Shu.
Huai is spoken in central Anhui, northern Jiangxi, far western and eastern Hubei and most of Jiangsu. Due to its preservation of a checked tone, some linguists believe that Huai ought to be treated as a top-level group, like Jin. Representative lects tend to be Nanjingnese, Hefeinese and Yangzhounese. The Huai of Nanjing likely served as a national prestige variety during the Ming and Qing periods, though this viewpoint is not supported by all linguists.
The Language Atlas divides Huai into Tongtai, Huangxiao, and Hongchao areas, with the latter further split into Ninglu and Huaiyang. Tongtai, being geographically located furthest east, has the most significant Wu influence, such as in its distribution of the historical voiced plosive series.
Yue Chinese is spoken by around 84 million people, in western Guangdong, eastern Guangxi, Hong Kong, Macau and parts of Hainan, as well as overseas communities such as Kuala Lumpur and Vancouver. Famous lects such as Cantonese and Taishanese belong to this family. Yue Chinese lects generally possess long-short distinctions in their vowels, which is reflected in their almost universally split dark checked and often split light checked tones. They generally also tend to preserve all three checked plosive finals and three nasal finals. The status of Pinghua is uncertain, and some believe its two groups, Northern and Southern, should be listed under Yue, though this standpoint is rejected by some.
Yue is generally split into Cantonese (which itself contains Yuehai, Xiangshan, and Guanbao), Siyi, Gaoyang, Qinlian, Wuhua, Goulou (which includes Luoguang), Yongxun and the two Pinghua branches. Siyi is generally agreed to be the most divergent, and Goulou is believed to be the one most closely related to Pinghua.
Hakka Chinese is a direct result of several migration waves from Northern China to the South, and is spoken in eastern Guangdong, parts of Taiwan, western Fujian, Hong Kong, southern Jiangxi, as well as scattered points in the rest of Guangdong, Hunan, Guangxi and Hainan, along with overseas communities such as in Singkawang, Indonesia, by an estimated total of 44 million people. Some believe that Hakka is closely related to other groups, such as Gan, Yue, or Tongtai. Hakka varieties generally have no voiced plosive initials and preserve the historical ri initial (日母) as an n-like sound.
Hakka can be divided into Yuetai, Hailu, Yuebei, Yuexi, Tingzhou, Ninglong, Yuxin and Tonggui. Meizhounese is often used as the representative variety of Hakka.
Min Chinese is a direct descendant of Old Chinese, and is spoken in Chaoshan and Zhanjiang of Guangdong, Hainan, Taiwan, most of Fujian and parts of Jiangxi and Zhejiang, by around 76 million people. Due to significant amounts of migration, many people in Southeast Asia and Hong Kong are also able to speak Min varieties. Lects such as Teoswa, Hainanese, Hokkien (incl. Taiwanese) and Hokchiu are all Min varieties.
Due to the fact that Min descended from Old Chinese rather than Middle Chinese, it has some features that would be out of place in other varieties. For instance, some words with the cheng initial (澄母) are not affricates in Min. This, interestingly, has led to many languages, such as Occitan, Inuktitut, Latin, Māori and Telugu, loaning the Sinitic word for tea (茶) with a plosive. Min varieties also have a very large number of words with literary pronunciations.
Min can primarily be split into Coastal and Inland Min varieties. The former contains the Southern Min branches of Quanzhang (Hokkien), Chaoshan (Teoswa), Datian and Zhongshan, the Eastern Min branches of Houguan and Funing, Qionglei Min, as well as Puxian Min, whereas the latter includes Northern, Central and Shaojiang Min. Shaojiang Min acts as a transitional area between Min, Gan, and Hakka.
Wu Chinese is spoken in most of Zhejiang, Shanghai, southern Jiangsu, parts of southern Anhui and eastern Jiangxi by around 82 million people. Many large cities in the Yangtze Delta, such as Suzhou, Changzhou, Ningbo and Hangzhou, use a Wu variety. Wu varieties generally have a fricative initial in their negators, a three-way plosive distinction, as well as a checked coda preserved as a glottal stop, with the exception of Oujiang lects, where it has become vowel length, and Xuanzhou.
Shanghainese, Suzhounese and Wenzhounese are usually used as representatives of Wu. Wu Chinese varieties generally have a massive number of vowels, which rivals even North Germanic languages. The Dondac variety has been observed to have 20 phonemic monophthongal vowels, according to one analysis.
Qian Nairong divides Wu into Taihu (or Northern Wu), Taizhou, Oujiang, Chuqu and Wuzhou. Northern Wu is further divided into Piling, Suhujia, Tiaoxi, Linshao, Yongjiang and Hangzhou, though Hangzhou's classification is unclear.
Huizhou Chinese is spoken in western Hangzhou, southern Anhui and parts of Jingdezhen, by around 5 million people. It is identified as a top level group by the Language Atlas, though some linguists believe in other theories, such as it being a Gan-influenced Wu variety, due to an identifiable basis of Old Wu features. Hui varieties are phonologically diverse, and some features are shared with Wu, such as the simplification of diphthongs. Hui can be divided into Jishe, Xiuyi, Qiwu, Jingzhan and Yanzhou branches, with Tunxinese and Jixinese being representatives.
Gan Chinese is spoken in northern and central Jiangxi, parts of Hubei and Anhui and eastern Hunan, by 22 million people. It is sometimes believed to be related to Hakka. Gan varieties tend to not palatalize terms with the jian initial (見母) and have an f-like initial in closed xiao and xia initial (合口曉匣兩母) terms, among other features.
Gan can also be divided into Northern and Southern groups. The Northern group was formed during the Tang Dynasty, whereas the Southern group was developed on the basis of Northern Gan. The Language Atlas sees Gan divided into Changdu, Yiliu, Jicha, Fuguang, Yingyi, Datong, Dongsui, Huaiyue and Leizi branches. Nanchangnese is often chosen as the representative. Shaojiang Min is identified as being influenced by, or even closely related to, Fuguang Gan.
Xiang Chinese is spoken in central and western Hunan and nearby parts of Guangxi and Guizhou by an estimated 37 million people. Due to migrations, Xiang can be split into New and Old Xiang groups, with Old Xiang having fewer Mandarin-influenced features. Xiang varieties have universally lost their checked codas, but the majority of them still have a unique preserved checked tone contour. Most also have a three-way plosive distinction, like Wu varieties.
One way of dividing Xiang varieties sees five distinct families, namely Changyi, Hengzhou, Louzhao, Chenxu, and Yongzhou. Changshanese and one of Shuangfengnese or Loudinese are usually taken as Xiang representatives.
The traditional, dialectological classification of Chinese languages is based on the evolution of the sound categories of Middle Chinese. Little comparative work has been done (the usual way of reconstructing the relationships between languages), and little is known about mutual intelligibility. Even within the dialectological classification, details are disputed, such as the establishment in the 1980s of three new top-level groups: Huizhou, Jin and Pinghua, despite the fact that Pinghua is itself a pair of languages and Huizhou may be half a dozen.
Like Bai, the Min languages are commonly thought to have split off directly from Old Chinese. The evidence for this split is that all Sinitic languages apart from the Min group can be fit into the structure of the Qieyun, a 7th-century rime dictionary. However, this view is not universally accepted.
Like many other language families, Sinitic languages have had problems of classification. The following are a few examples.
Traditionally, the lect of urban Hangzhou and New Xiang of eastern Hunan are not considered Mandarin. However, linguists such as Richard VanNess Simmons and Zhou Zhenhe have observed that these two varieties possess more of the qualifying features of Mandarin languages. For instance, the vowels of the second division of the jia (假) rime series are often raised and backed in Wu and Xiang, while they are not in Hangzhounese and New Xiang.
|Traditionally Mandarin||Traditionally Wu||Traditionally Xiang||Gloss|
Note that Nantongnese has heavy Wu influence, which has led to it also having raised and backed vowels.
Danzhounese (儋州話) and Maihua (邁話) are both traditionally considered Yue lects. Recent research, however, has noted that both are more likely unclassified. Maihua, for example, may be a Yue-Hakka-Hainanese Min mixed language.
Dongjiang Bendihua (東江本地話) is spoken in and around Huizhou and Heyuan. Its classification has always been unclear, though the most common standpoint is that it is considered Hakka.
The variety spoken in the Ganyu District of Lianyungang (贛榆話) is listed as a variety of Central Plains Mandarin in the Language Atlas of China, though its tonal distribution is more similar to Peninsular Mandarin varieties.
Jerry Norman classified the traditional seven dialect groups into three larger groups: Northern (Mandarin), Central (Wu, Gan, and Xiang) and Southern (Hakka, Yue, and Min). He argued that the Southern Group is derived from a standard used in the Yangtze valley during the Han dynasty (206 BC – 220 AD), which he called Old Southern Chinese, while the Central group was transitional between the Northern and Southern groups. Some dialect boundaries, such as between Wu and Min, are particularly abrupt, while others, such as between Mandarin and Xiang or between Min and Hakka, are much less clearly defined.
Scholars account for the transitional nature of the central varieties in terms of wave models. Iwata argues that innovations have been transmitted from the north across the Huai River to the Lower Yangtze Mandarin area and from there southeast to the Wu area and westwards along the Yangtze River valley and thence to southwestern areas, leaving the hills of the southeast largely untouched.
A 2007 study compared fifteen major urban dialects on the objective criteria of lexical similarity and regularity of sound correspondences, and subjective criteria of intelligibility and similarity. Most of these criteria show a top-level split with Northern, New Xiang, and Gan in one group and Min (samples at Fuzhou, Xiamen, Chaozhou), Hakka, and Yue in the other group. The exception was phonological regularity, where the one Gan dialect (Nanchang Gan) was in the Southern group and very close to Meixian Hakka, and the deepest phonological difference was between Wenzhounese (the southernmost Wu dialect) and all other dialects.
The study did not find clear splits within the Northern and Central areas:
The two Wu dialects (Wenzhou and Suzhou) occupied an intermediate position, closer to the Northern/New Xiang/Gan group in lexical similarity and strongly closer in subjective intelligibility but closer to Min/Hakka/Yue in phonological regularity and subjective similarity, except that Wenzhou was farthest from all other dialects in phonological regularity. The two Wu dialects were close to each other in lexical similarity and subjective similarity but not in mutual intelligibility, where Suzhou was actually closer to Northern/Xiang/Gan than to Wenzhou.
In the Southern subgroup, Hakka and Yue grouped closely together on the three lexical and subjective measures but not in phonological regularity. The Min dialects showed high divergence: Fuzhou (Eastern Min) grouped only weakly with the Southern Min dialects of Xiamen and Chaozhou on the two objective criteria and was actually slightly closer to Hakka and Yue on the subjective criteria.
The following section compares the non-Bai Sinitic languages. Though all stem from Old Chinese, they have all developed differences from one another.
Typographically, the vast majority of Sinitic languages use Sinographs. However, some varieties, such as Dungan and Hokkien, have alternative scripts, namely Cyrillic and Latin alphabets. Even between varieties which use Sinographs, characters are repurposed or invented to cover for the difference in vocabulary. Examples include 靚 ("pretty") in Yue, 𠊎 ("I, me") in Hakka, 即 ("this") in Hokkien, 覅 ("to not want") in Wu, 莫 ("do not") in Xiang and 嘎 ("ill-tempered") in Mandarin. Note that both traditional and simplified characters can be used to write any lect.
Phonologically speaking, though all Sinitic languages possess tones, their contours and the total number of tones vary widely, from Shanghainese, which can be analysed as having only two tones, to Bobainese, which has ten. Sinitic languages also vary widely in their phonological inventories and phonotactics. Take for instance /mɭɤŋ/ (門兒, "door (diminutive)") seen in Pingdingnese, or /tʃɦɻʷəi/ (水, "water") of Xuanzhounese, both of which are syllables that do not follow the (single) consonant-glide-vowel-consonant syllable structure of better-known lects. Tone sandhi is also a feature which not all lects share. Cantonese, for instance, only has a very weak system, whereas Wu varieties not only have complex, intricate systems that affect almost all syllables, but also use them to mark grammatical part of speech. Take for instance this simplified analysis of Suzhounese tone sandhi:
|1st char tone cat ↓ / chain length →||2 char||3 char||4 char|
|dark level (1)||4 0||4 4 0||4 4 4 0|
|light level (2)||2 3||2 3 0||2 3 4 0|
|rising (3)||5 1||5 1 0||5 1 1 0|
|dark departing (5)||52 3||52 3 0||52 3 4 0|
|light departing (6)||23 1||23 1 0||23 1 1 0|
|chain length →||2 char||3 char||4 char|
|level (1, 2), dark (7)||4 23||4 23 0||4 23 4 0|
|level (1, 2), light (8)||2 3||2 3 0||2 3 4 0|
|rising (3), dark (7)||5 51||5 51 0||5 51 1 0|
|rising (3), light (8)||2 51||2 51 0||2 51 1 0|
|departing (5, 6), dark (7)||5 523||5 52 3||5 52 2 3|
|departing (5, 6), light (8)||2 523||2 52 3||2 52 2 3|
|checked (7, 8), dark (7)||4 4||4 4 0||4 4 4 2|
|checked (7, 8), light (8)||3 4||3 4 0||3 4 2 0|
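The first of the two tables above, covering chains whose first character carries a level, rising or departing tone, can be read as a lookup from the first character's tone category and the chain length to a surface pattern. The following Python sketch illustrates that reading. The names `SUZHOU_SANDHI` and `sandhi_pattern` are hypothetical, the values are copied from the simplified table rather than from a full phonological analysis, and the second table is not modelled.

```python
# Surface tone patterns from the simplified Suzhounese analysis above.
# Outer key: tone category of the first character.
# Inner key: chain length in characters.
SUZHOU_SANDHI = {
    1: {2: "4 0", 3: "4 4 0", 4: "4 4 4 0"},      # dark level
    2: {2: "2 3", 3: "2 3 0", 4: "2 3 4 0"},      # light level
    3: {2: "5 1", 3: "5 1 0", 4: "5 1 1 0"},      # rising
    5: {2: "52 3", 3: "52 3 0", 4: "52 3 4 0"},   # dark departing
    6: {2: "23 1", 3: "23 1 0", 4: "23 1 1 0"},   # light departing
}


def sandhi_pattern(first_tone: int, chain_length: int) -> list:
    """Return one tone value per character for the given chain."""
    try:
        pattern = SUZHOU_SANDHI[first_tone][chain_length]
    except KeyError:
        raise ValueError(
            f"no entry for first tone {first_tone}, length {chain_length}"
        ) from None
    return pattern.split()


print(sandhi_pattern(5, 3))  # ['52', '3', '0']
```

In this analysis only the first character's category matters for such chains: the lookup never consults the citation tones of the later characters, which is what makes the system a left-dominant one.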
Disregarding phonology, grammar is the feature in which Sinitic languages differ the most. The majority of Sinitic languages do not possess tenses; exceptions include Northern Wu lects such as Shanghainese and Suzhounese, though the tense system is largely breaking down in Shanghainese due to Mandarin influence. Sinitic languages generally also have no case marking, though lects such as Linxianese and Hengshannese do possess case particles, with the latter expressing case through tone change. Sinitic languages generally have SVO word order and possess classifiers.
Verb usage may be different between Sinitic languages. Notice the double verb marking seen in lects such as Beijingnese, in these sentences meaning "today I go to Guangzhou":
Sinitic languages tend to vary greatly in how they mark indirect objects. The main point of variation is the relative placement of the indirect and direct objects.
Mandarinic, Xiang, Hui and Min languages often place the indirect object (IO) before the direct object (DO). Some lects have switched to IO-DO structure due to Mandarin influence, such as Nanchangnese and Shanghainese, though Shanghainese also has the alternative word order.
On the other hand, Gan, Wu, Hakka, and Yue languages tend to place the DO in front of the IO.
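The two ordering tendencies can be summarised as a simple mapping from language group to object order. The Python sketch below is illustrative only: `OBJECT_ORDER` and `arrange_ditransitive` are hypothetical names, English glosses stand in for actual words, the verb is assumed to precede both objects, and individual lects may deviate from their group's tendency.

```python
# Typical order of indirect object (IO) and direct object (DO) by group,
# as described in the text. These are tendencies, not absolute rules.
OBJECT_ORDER = {
    "Mandarinic": ("IO", "DO"),
    "Xiang": ("IO", "DO"),
    "Hui": ("IO", "DO"),
    "Min": ("IO", "DO"),
    "Gan": ("DO", "IO"),
    "Wu": ("DO", "IO"),
    "Hakka": ("DO", "IO"),
    "Yue": ("DO", "IO"),
}


def arrange_ditransitive(group: str, verb: str, indirect: str, direct: str) -> list:
    """Order a verb and its two objects following the group's typical pattern."""
    slots = {"IO": indirect, "DO": direct}
    return [verb] + [slots[role] for role in OBJECT_ORDER[group]]


print(arrange_ditransitive("Mandarinic", "give", "him", "a book"))
print(arrange_ditransitive("Yue", "give", "him", "a book"))
```

A per-lect override table would be needed to capture cases such as Nanchangnese and Shanghainese, which the text notes have shifted towards the IO-DO order.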
Like other East Asian languages such as Japanese and Korean, Sinitic languages have a system of classifiers; however, the use of classifiers varies greatly in features such as definiteness. Cantonese classifiers, for instance, can be used to mark possession, which is rare in Sinitic but common in Southeast Asia.
個 and 隻 are the most common generic classifiers cross-linguistically. As previously mentioned, Mandarinic languages tend to have fewer classifiers whereas the Southern non-Mandarinic varieties tend to have more.
Sinitic languages can vary greatly in their systems of demonstratives. Standard Mandarin and other Northeastern varieties have a two-way system: 這 (zhè, proximal) and 那 (nà, distal), but this is not the only system found in Sinitic languages.
Wuhannese has a neutral demonstrative, which can be used regardless of the distance to the deictic center. Similar systems are found in Northern Wu lects such as Suzhounese and Ningbonese.
In the above sentence, /nɤ³⁵/ can be translated as both "this" and "that". Though Wuhannese has this one-term neutral system, it also has a two-way proximal-distal system. The same is true of most other lects with a one-term system.
Even within two-way systems, which are the most common, cognate terms may have developed to mean opposite distances from the deictic center. Cantonese 嗰 (go², distal) and Shanghainese 搿 (geq, proximal) are both etymologically from 個, for instance.
Many Sinitic languages have three-way systems, but the three distances are not always the same ones. For instance, whereas Guangshan Mandarin has a person-oriented proximal, medial, distal system, Xinyu Gan has a distance-oriented close, proximal, distal system. Gan especially has many varieties with a three-way system, sometimes even marked with tone and vowel length rather than just changing the term used.
A small number of varieties possess even four- or five-term demonstrative systems. Take for instance the following:
These two lects use tone change and vowel length respectively to distinguish between the four demonstratives.