Ethnicity: Sinitic peoples
Geographic distribution: China, Taiwan, Singapore, Christmas Island, Chinese diaspora communities, Ferghana Valley (Uzbekistan) and Chu Valley (Kyrgyzstan) (Dungans), Russian Far East (Tazs)
Linguistic classification: Sino-Tibetan
  • Sinitic
ISO 639-5: zhx
Glottolog: sini1245 (Sinitic); macr1275 (Macro-Bai)
Map of Sinitic languages in China

The Sinitic languages[a] (simplified Chinese: 汉语族; traditional Chinese: 漢語族; pinyin: Hànyǔ zú), often synonymous with the Chinese languages, are a group of East Asian analytic languages that constitute a major branch of the Sino-Tibetan language family. It is frequently proposed that there is a primary split between the Sinitic languages and the rest of the family (the Tibeto-Burman languages). This view is rejected by a number of researchers[4] but has found phylogenetic support among others.[5][6] The Macro-Bai languages, whose classification is difficult, may be an offshoot of Old Chinese and thus Sinitic;[7] otherwise Sinitic is defined only by the many varieties of Chinese unified by a shared historical background, and usage of the term "Sinitic" may reflect the linguistic view that Chinese constitutes a family of distinct languages, rather than variants of a single language.[b]


Over 91% of the Chinese population speaks a Sinitic language.[9] Approximately 1.52 billion people are speakers of the Chinese macrolanguage, of whom about three-quarters speak a Mandarin variety. Estimates of the number of global speakers of Sinitic branches as of 2018–19, both native and non-native, are listed below:[10]

Branch Speakers pct.
Mandarin 1,118,584,040 73.50%
Yue 85,576,570 5.62%
Wu 81,817,790 5.38%
Min 75,633,810 4.97%
Jin 47,100,000 3.09%
Hakka 44,065,190 2.90%
Xiang 37,400,000 2.46%
Gan 22,200,000 1.46%
Huizhou 5,380,000 0.35%
Pinghua 4,130,000 0.27%
Dungan 56,300 0.004%
Total 1,521,943,700 100%
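The percentage column can be re-derived from the speaker counts. The following is a minimal sketch in Python; the names `SPEAKERS` and `shares` are illustrative and not taken from any source, and the figures are simply copied from the table above.

```python
# Speaker counts copied from the table above (native and non-native, 2018-19).
SPEAKERS = {
    "Mandarin": 1_118_584_040,
    "Yue": 85_576_570,
    "Wu": 81_817_790,
    "Min": 75_633_810,
    "Jin": 47_100_000,
    "Hakka": 44_065_190,
    "Xiang": 37_400_000,
    "Gan": 22_200_000,
    "Huizhou": 5_380_000,
    "Pinghua": 4_130_000,
    "Dungan": 56_300,
}


def shares(counts):
    """Return each branch's share of the summed total, in percent."""
    total = sum(counts.values())
    return {branch: 100 * n / total for branch, n in counts.items()}


TOTAL = sum(SPEAKERS.values())
SHARES = shares(SPEAKERS)

if __name__ == "__main__":
    for branch, pct in SHARES.items():
        print(f"{branch:<9}{SPEAKERS[branch]:>15,}{pct:>9.3f}%")
    print(f"{'Total':<9}{TOTAL:>15,}")
```

Summing the eleven branch figures reproduces the total given in the last row, and the Mandarin share rounds to the "about three-quarters" mentioned in the text.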


Further information: List of varieties of Chinese

L1 speakers of Chinese and other Sino-Tibetan languages according to Ethnologue

Dialectologist Jerry Norman estimated that there are hundreds of mutually unintelligible Sinitic languages.[11] They form a dialect continuum in which differences generally become more pronounced as distances increase, though there are also some sharp boundaries.[12] The Sinitic languages can be divided into Macro-Bai languages and Chinese languages, and what follows is one of many possible ways of subdividing them. Some varieties, such as Shaozhou Tuhua, are hard to classify and are therefore not included in the overview below.

Macro-Bai languages

The Macro-Bai family was first proposed by linguist Zhengzhang Shangfang[13] and was later expanded to include Longjia and Luren.[14][15] It likely split off from the rest of Sinitic during the Old Chinese period.[16] The languages included are all considered minority languages in China and are spoken in the Southwest.[17][18] The languages are:

All other Sinitic languages are treated as Chinese in what follows.


The Chinese branch of the family is classified into at least seven main families. These families are classified based on five main evolutionary criteria:[9]

  1. The evolution of the historical fully muddy (全浊; 全濁; quánzhuó) initials
  2. The distribution of rimes across the four tone qualities, as conditioned by voicing and aspiration of initials
  3. The evolution of the checked (入; rù) tone category
  4. The loss or retention of coda position plosives and nasals
  5. The palatalisation of the jiàn initial (見母; jiànmǔ) in front of high vowels

The varieties within one family may not be mutually intelligible. For instance, Wenzhounese and Ningbonese are not highly mutually intelligible. The Language Atlas of China identifies ten groups:[19]

with Jin, Hui, Pinghua, and Tuhua not part of the seven traditional groups.


Varieties of Mandarin are used in the Western Regions, the Southwest, Huguang, Inner Mongolia, the Central Plains and the Northeast,[19] by around three-quarters of the Sinitic-speaking population.[10] Historically, the prestige variety has always been Mandarin, which is still reflected to this day in Standard Chinese.[20] Standard Chinese is now an official language of the Republic of China, the People's Republic of China, Singapore and the United Nations.[9] Re-population efforts, such as that of the Qing dynasty in the Southwest, tended to involve Mandarin speakers.[21] The classification of Mandarin lects has undergone several significant changes, though nowadays it is commonly divided as follows, based on the distribution of the historical checked tone:[19]

as well as other lects, which do not neatly fall into these categories, such as Mandarin Junhua varieties.

Varieties of Mandarin can be defined by their universally lost -m final, low number of tones, and smaller inventory of classifiers, among other features. Mandarin lects also often have rhotic erhua rimes, though how heavily these are used varies between lects.[9] Loss of the checked tone is an often-cited criterion for Mandarin languages, though lects such as Yangzhounese and Taiyuannese show otherwise.

Group          Lect         'sound'   'heart'
Mandarin       Beijing      in        ɕin
Mandarin       Jinan        iẽ        ɕiẽ
Mandarin       Zhengzhou    iən       siən
Mandarin       Xi'an        iẽ        ɕiẽ
Mandarin       Taiyuan      iəŋ       ɕiəŋ
Mandarin       Chengdu      in        ɕin
Mandarin       Nanjing      in        sin
Non-Mandarin   Guangzhou    iɐm       sɐm
Non-Mandarin   Meizhou      im        sim
Non-Mandarin   Xiamen       im        sim
Non-Mandarin   Anyi         im        ɕim
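The loss of the -m final can be checked mechanically against the forms in the table. The sketch below is only an illustration in Python: the names `REFLEXES` and `keeps_final_m` are hypothetical, the transcriptions are copied from the table, and the test is a plain string check rather than a phonological analysis.

```python
# (group, 'sound', 'heart') for each lect, copied from the table above.
REFLEXES = {
    "Beijing": ("Mandarin", "in", "ɕin"),
    "Jinan": ("Mandarin", "iẽ", "ɕiẽ"),
    "Zhengzhou": ("Mandarin", "iən", "siən"),
    "Xi'an": ("Mandarin", "iẽ", "ɕiẽ"),
    "Taiyuan": ("Mandarin", "iəŋ", "ɕiəŋ"),
    "Chengdu": ("Mandarin", "in", "ɕin"),
    "Nanjing": ("Mandarin", "in", "sin"),
    "Guangzhou": ("Non-Mandarin", "iɐm", "sɐm"),
    "Meizhou": ("Non-Mandarin", "im", "sim"),
    "Xiamen": ("Non-Mandarin", "im", "sim"),
    "Anyi": ("Non-Mandarin", "im", "ɕim"),
}


def keeps_final_m(lect):
    """True if both sample words still end in -m in the given lect."""
    _group, *forms = REFLEXES[lect]
    return all(form.endswith("m") for form in forms)


# Lects that retain -m, in table order.
M_RETAINING = [lect for lect in REFLEXES if keeps_final_m(lect)]

if __name__ == "__main__":
    for lect, (group, *_forms) in REFLEXES.items():
        status = "keeps -m" if keeps_final_m(lect) else "has lost -m"
        print(f"{lect:<10} {group:<13} {status}")
```

On these two sample words, every lect listed under Mandarin has replaced -m with -n, -ŋ or a nasalised vowel, while the four non-Mandarin lects retain it.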

Northeastern and Beijing Mandarin

Northeastern Mandarin is spoken in Heilongjiang, Jilin, most of Liaoning and northeastern Inner Mongolia, whereas Beijing Mandarin is spoken in northern Hebei, most of Beijing, and parts of Tianjin and Inner Mongolia.[19] The two groups' most notable features are the heavy use of rhotic erhua and a seemingly random distribution of the dark checked tone. They generally have four tones, with high flat, rising, dipping and falling contours.

Tone contour of historically dark checked tone (陰入) characters
Group                  Lect        'guest'   'eight'   'north'
Northeastern/Beijing   Harbin      213       44        213
Northeastern/Beijing   Changchun   53        44        213
Northeastern/Beijing   Shenyang    213       33        213
Northeastern/Beijing   Beijing     53        55        213
Other                  Heyuan      5         5         5
Other                  Chaozhou    21        21        21
Other                  Suzhou      55        55        55
Other                  Hefei       5         5         5
Other                  Wuhan       213       213       213
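One rough way to see the "seemingly random" distribution is to count how many different contours the three sample words take in each lect. This is a small Python sketch under that assumption; `DARK_CHECKED` and `distinct_contours` are illustrative names, and three words are far too few to draw real conclusions from.

```python
# Contours of three historically dark checked words, copied from the table above.
DARK_CHECKED = {
    "Harbin": {"guest": "213", "eight": "44", "north": "213"},
    "Changchun": {"guest": "53", "eight": "44", "north": "213"},
    "Shenyang": {"guest": "213", "eight": "33", "north": "213"},
    "Beijing": {"guest": "53", "eight": "55", "north": "213"},
    "Heyuan": {"guest": "5", "eight": "5", "north": "5"},
    "Chaozhou": {"guest": "21", "eight": "21", "north": "21"},
    "Suzhou": {"guest": "55", "eight": "55", "north": "55"},
    "Hefei": {"guest": "5", "eight": "5", "north": "5"},
    "Wuhan": {"guest": "213", "eight": "213", "north": "213"},
}


def distinct_contours(lect):
    """Number of different contours the sample words show in one lect."""
    return len(set(DARK_CHECKED[lect].values()))


# Lects in which the historical category has more than one reflex.
SPLIT_LECTS = [lect for lect in DARK_CHECKED if distinct_contours(lect) > 1]

if __name__ == "__main__":
    for lect in DARK_CHECKED:
        print(f"{lect:<10} {distinct_contours(lect)} contour(s)")
```

In this sample only the four Northeastern and Beijing lects split the category across several contours; each of the other lects uses a single contour for all three words.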

Northeastern Mandarin, especially in Heilongjiang, contains many loanwords from Russian.[24]

Term Pronunciation Meaning Origin
卜留克 bǔliúkè 'rutabaga' брюква bryukva
馬神 mǎshén 'machine' машина mashina
巴籬子 bālízi 'jail' полиция politsiya

Northeastern Mandarin lects can be divided into three main groups, namely Hafu (including Harbinnese and Changchunnese), Jishen (including Jilinnese and Shenyangnese), and Heisong. Notably, the extinct Taz language of Russia is also a Northeastern Mandarin language. Beijing Mandarin is sometimes included in Northeastern Mandarin due to its distribution of the historical dark checked tone,[22][23] though it is listed as its own group by others, often due to its more regular light checked tones.[19]

Jilu Mandarin

Jilu Mandarin is spoken in southern Hebei and western Shandong,[19] and is often represented by Jinannese.[25] Notable cities that use Jilu Mandarin lects include Cangzhou, Shijiazhuang, Jinan and Baoding.[26][27] Characteristic Jilu Mandarin features include merging the dark checked tone into the dark level tone, merging the light checked tone into light level or departing based on the manner of articulation of the initial, and vowel breaking in checked-tone words of the tong rime series (通攝), among other features.

Jilu Mandarin can be classified into Baotang, Shiji, Canghui and Zhangli.[28] Zhangli is of note due to its preservation of a separate checked tone.

Jiaoliao Mandarin

Distribution of Jiaoliao Mandarin varieties

Jiaoliao Mandarin is spoken on the Jiaodong and Liaodong peninsulas, which include the cities of Dalian and Qingdao, as well as in several prefectures along the China-Korea border.[19] Like Jilu Mandarin, it merges the light checked tone into light level or departing based on the manner of articulation of the initial, though its dark checked tone is merged into the rising tone. Terms with the rì initial (日母) are pronounced with a null initial (apart from open zhǐ rime series (止攝開口) finals), unlike the /ʐ/ of Northern and Beijing Mandarin.[29]

Based on, for example, the pronunciation of the palatalized jiàn initial (見母),[19] Jiaoliao Mandarin can be divided into Qingzhou, Denglian and Gaihuan areas.[28]

Yantai Weihai Qingdao Dalian Gloss
ciau ciau tɕiɔ tɕiɔ 'to hand in'
cian cian tɕiã tɕiɛ̃ 'to see'

Central Plains and Lanyin Mandarin

Central Plains Mandarin is spoken in the Central Plains of Henan, southwestern Shanxi, southern Shandong and northern Jiangsu, as well as most of Shaanxi, southern Ningxia and Gansu and southern Xinjiang, in famous cities such as Kaifeng, Zhengzhou, Luoyang, Xuzhou, Xi'an, Xining and Lanzhou.[30][31][32] Central Plains Mandarin lects merge the historical checked tones with a lesser muddy (次濁) or clear (清) initial into the rising tone, while those with a fully muddy (全濁) initial are merged with the light level tone.[19]

Lanyin Mandarin, spoken in northern Ningxia, parts of Gansu and northern Xinjiang, is sometimes grouped together with Central Plains Mandarin due to its merged lesser light and dark checked tones, though in Lanyin the merged tone is realised as a departing tone.

Subdivision of Central Plains Mandarin is not fully agreed upon, though one possible scheme has 13 divisions, namely Xuhuai, Zhengkai, Luosong, Nanlu, Yanhe, Shangfu, Xinbeng, Luoxiang, Fenhe, Guanzhong, Qinlong, Longzhong and Nanjiang.[33] Lanyin Mandarin, on the other hand, is divided into Jincheng, Yinwu, Hexi and Beijiang. The Dungan language is a collection of Central Plains Mandarin varieties spoken in the former Soviet Union.


Distribution of Jin varieties

Jin is spoken in most of Shanxi, western Hebei, northern Shaanxi, northern Henan and central Inner Mongolia,[19] and is often represented by Taiyuannese.[25] It was first proposed as a group separate from the rest of Mandarin by Li Rong, who defined it as the lects in and around Shanxi that have a checked tone, though this stance is not without disagreement.[34][35] Jin varieties also often have disyllabic words derived from syllable splitting (分音詞), through the infixation of /(u)əʔ l/.[9]

笨 → 薄 愣: pəŋ꜄ → pəʔ꜇ ləŋ꜄
滾 → 骨 攏: ꜂kʊŋ → kuəʔ꜆ ꜂lʊŋ 'to roll'

As per the Language Atlas by Li, Jin is divided into Dabao, Zhanghu, Wutai, Lüliang, Bingzhou, Shangdang, Hanxin, and Zhiyan branches.[19]

Southwestern Mandarin

Southwestern Mandarin is spoken in Yunnan, Guizhou, northern Guangxi, most of Sichuan, southern Gansu and Shaanxi, Chongqing, most of Hubei and bordering parts of Hunan, as well as Kokang in Myanmar and parts of northern Thailand. It covers the largest area and has the largest population of all Mandarinic language groups, and would be the eighth most spoken language in the world if separated from the rest of Mandarin.[19] Southwestern Mandarin tends not to have retroflex consonants and merges all checked tone categories together: with the exception of Minchi, which has a standalone checked category, the checked tone is merged with another category. Representative lects include Wuhannese and Sichuanese, and sometimes Kunmingnese.[25]

Southwestern Mandarin is usually split into Chuanqian, Xishu, Chuanxi, Yunnan, Huguang and Guiliu branches. Minchi is sometimes separated out as a remnant of Old Shu.[36]


Distribution of Huai varieties

Huai is spoken in central Anhui, northern Jiangxi, far western and eastern Hubei and most of Jiangsu.[19] Due to its preservation of a checked tone, some linguists believe that Huai ought to be treated as a top-level group, like Jin. Representative lects tend to be Nanjingnese, Hefeinese and Yangzhounese.[25] The Huai of Nanjing likely served as the national prestige variety during the Ming and Qing periods,[37] though this viewpoint is not supported by all linguists.[38]

The Language Atlas divides Huai into Tongtai, Huangxiao, and Hongchao areas, with the latter further split into Ninglu and Huaiyang. Tongtai, being geographically located furthest east, has the most significant Wu influence, such as in its distribution of the historical voiced plosive series.[19][39][40]

Tongtai Non-Tongtai
Nantong Taizhou Yangzhou Hangzhou Fuzhou Huizhou
tʰi tʰi ti di tei ti
pʰeŋ pʰiŋ pin biŋ paŋ piaŋ


Distribution of Yue varieties (including Pinghua)

Yue Chinese is spoken by around 86 million people,[10] in western Guangdong, eastern Guangxi, Hong Kong, Macau and parts of Hainan, as well as overseas communities such as Kuala Lumpur and Vancouver.[19] Famous lects such as Cantonese and Taishanese belong to this family.[9] Yue Chinese lects generally possess long-short distinctions in their vowels, which is reflected in their almost universally split dark checked and often split light checked tones. They generally also tend to preserve all three checked plosive finals and three nasal finals. The status of Pinghua is uncertain, and some believe its two groups, Northern and Southern, should be listed under Yue,[41] though this standpoint is rejected by some.[19]

Checked tone contours in Yue lects
Tone Dark Light
Short Long Short Long
Guangzhou 55 33 22
Hong Kong 55 33 22
Dongguan 44 224 22
Shiqi 5 3
Taishan 55 33 21
Bobai 55 33 22
Yulin 5 3 2 21

Yue is generally split into Cantonese (which itself contains Yuehai, Xiangshan, and Guanbao), Siyi, Gaoyang, Qinlian, Wuhua, Goulou (which includes Luoguang), Yongxun and the two Pinghua branches.[19] Siyi is generally agreed to be the most divergent, and Goulou is believed to be the branch most closely related to Pinghua.[41]


Hakka Chinese is a direct result of several migration waves from Northern China to the South.[42] It is spoken in eastern Guangdong, parts of Taiwan, western Fujian, Hong Kong and southern Jiangxi, at scattered points in the rest of Guangdong, Hunan, Guangxi and Hainan, and in overseas communities such as those in West Kalimantan and the Bangka Belitung Islands in Indonesia, by an estimated total of 44 million people.[19][10] Some believe that Hakka is closely related to other groups, such as Gan, Yue, or Tongtai.[43][44][45] Hakka varieties generally have no voiced plosive initials and preserve the historical rì initial (日母) as an n-like sound.[19][46]

Realization of the historical rì initial (日母) in Hakka
Meizhou Changting Hsinchu Hong Kong Yudu
ȵin neŋ ȵin ŋɡin niẽ
ȵit ni ȵit ŋɡit nie

Hakka can be divided into Yuetai, Hailu, Yuebei, Yuexi, Tingzhou, Ninglong, Yuxin and Tonggui.[19] Meizhounese is often used as the representative variety of Hakka.[25]


Distribution of Min varieties in mainland China, Hainan and Taiwan

Min Chinese is a direct descendant of Old Chinese, and is spoken in Chaoshan and Zhanjiang of Guangdong, Hainan, Taiwan, most of Fujian and parts of Jiangxi and Zhejiang, by around 76 million people.[10] Due to significant amounts of migration, many people in Southeast Asia and Hong Kong are also able to speak Min varieties. Lects such as Teoswa, Hainanese, Hokkien (incl. Taiwanese) and Hokchiu are all Min varieties.[19]

Because Min descended from Old Chinese rather than Middle Chinese, it has some features that would be out of place in other varieties. For instance, some words with the cheng initial (澄母) are not affricates in Min. As a result, many languages, such as Occitan, Inuktitut, Latin, Māori and Telugu, have borrowed the Sinitic word for 'tea' (茶) with a plosive. Min varieties also have a very large number of words with literary pronunciations.[9]

Selection of reflexes of the cheng initial
Min Non-Min
Fuzhou Quanzhou Chaozhou Putian Jian'ou Haikou Leizhou Lanzhou Guiyang Changsha
ta te te ta ʔdɛ te tʂʰa tsʰa tsa
tiŋ tan tʰiŋ tɛŋ teiŋ ʔdaŋ taŋ tʂʰən tsʰən tsən

Min can primarily be split into Coastal and Inland Min varieties. The former contains the Southern Min branches of Quanzhang (Hokkien), Chaoshan (Teoswa), Datian and Zhongshan, the Eastern Min branches of Houguan and Funing, Qionglei Min, as well as Puxian Min, whereas the latter includes Northern, Central and Shaojiang Min. Shaojiang Min acts as a transitional area between Min, Gan, and Hakka.[20][34]


Distribution of Wu varieties

Wu Chinese is spoken in most of Zhejiang, Shanghai, southern Jiangsu, parts of southern Anhui and eastern Jiangxi by around 82 million people.[19][10][47] Many large cities in the Yangtze Delta, such as Suzhou, Changzhou, Ningbo and Hangzhou, use a Wu variety. Wu varieties generally have a fricative initial in their negators, a three-way plosive distinction, as well as a checked coda preserved as a glottal stop, with the exception of Oujiang lects, where it has become vowel length, and Xuanzhou.[47][40]

An example of a tripartite division of plosives
Shanghai Suzhou Changzhou Shaoxing Ningbo Taizhou Wenzhou Jinhua Lishui Quzhou
tʰoŋ tʰoŋ tʰoŋ tʰoŋ tʰoŋ tʰoŋ tʰoŋ tʰoŋ tʰɔŋ tʰaŋ
toŋ toŋ toŋ toŋ toŋ toŋ toŋ toŋ tɔŋ taŋ
doŋ doŋ doŋ doŋ doŋ doŋ doŋ doŋ dɔŋ daŋ
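The three-way distinction shown in the table can be made explicit by labelling the onset of each form. The Python sketch below is illustrative only: `WU_FORMS` and `plosive_series` are hypothetical names, and the labels describe the transcriptions as written in the table rather than any detailed phonetic analysis.

```python
# The three rows of the table above, one tuple per lect.
WU_FORMS = {
    "Shanghai": ("tʰoŋ", "toŋ", "doŋ"),
    "Suzhou": ("tʰoŋ", "toŋ", "doŋ"),
    "Changzhou": ("tʰoŋ", "toŋ", "doŋ"),
    "Shaoxing": ("tʰoŋ", "toŋ", "doŋ"),
    "Ningbo": ("tʰoŋ", "toŋ", "doŋ"),
    "Taizhou": ("tʰoŋ", "toŋ", "doŋ"),
    "Wenzhou": ("tʰoŋ", "toŋ", "doŋ"),
    "Jinhua": ("tʰoŋ", "toŋ", "doŋ"),
    "Lishui": ("tʰɔŋ", "tɔŋ", "dɔŋ"),
    "Quzhou": ("tʰaŋ", "taŋ", "daŋ"),
}


def plosive_series(form):
    """Label the initial plosive of a transcribed syllable."""
    if form.startswith("tʰ"):
        return "voiceless aspirated"
    if form.startswith("t"):
        return "voiceless unaspirated"
    if form.startswith("d"):
        return "voiced"
    raise ValueError(f"unexpected onset in {form!r}")


# Series labels per lect, in the row order of the table.
SERIES = {lect: [plosive_series(f) for f in forms] for lect, forms in WU_FORMS.items()}

if __name__ == "__main__":
    for lect, labels in SERIES.items():
        print(f"{lect:<10} {', '.join(labels)}")
```

Every lect in the sample keeps the three series apart; only the rime differs, in Lishui and Quzhou.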

Shanghainese, Suzhounese and Wenzhounese are usually used as representatives of Wu.[25] Wu Chinese varieties generally have very large vowel inventories, rivalling even those of North Germanic languages.[48][49] The Dondac variety has been observed to have 20 phonemic monophthongal vowels, according to one analysis.[50]

Qian Nairong divides Wu into Taihu (or Northern Wu), Taizhou, Oujiang, Chuqu and Wuzhou. Northern Wu is further divided into Piling, Suhujia, Tiaoxi, Linshao, Yongjiang and Hangzhou, though Hangzhou's classification is unclear.[40][47]


Huizhou Chinese is spoken in western Hangzhou, southern Anhui and parts of Jingdezhen, by around 5 million people.[19][10] It is identified as a top-level group by the Language Atlas, though some linguists hold other views, such as that it is a Gan-influenced Wu variety, due to an identifiable basis of Old Wu features.[9][51][52][53] Hui varieties are phonologically diverse, and some features are shared with Wu, such as the simplification of diphthongs.[54] Hui can be divided into Jishe, Xiuyi, Qiwu, Jingzhan and Yanzhou branches, with Tunxinese and Jixinese being representatives.


Gan Chinese is spoken in northern and central Jiangxi, parts of Hubei and Anhui, and eastern Hunan by 22 million people,[19][10] and is sometimes believed to be related to Hakka.[43][44] Gan varieties tend not to palatalize terms with the jian initial (見母) and have an f-like initial in closed xiao and xia initial (合口曉匣兩母) terms, among other features.[55]

Pronunciation of terms with a xia or xiao initial and closed medial in Gan
Nanchang Yichun Ji'an Fuzhou Yingtan
ϕɨi fi fei fai fɛi
ϕu fu fu fu fu

Gan can also be divided into Northern and Southern groups. The Northern group was formed during the Tang dynasty, whereas the Southern group was developed on the basis of Northern Gan.[9] The Language Atlas sees Gan divided into Changdu, Yiliu, Jicha, Fuguang, Yingyi, Datong, Dongsui, Huaiyue and Leizi branches.[19] Nanchangnese is often chosen as the representative.[25] Shaojiang Min has been identified as influenced by, or even closely related to, Fuguang Gan.[56]


Distribution of Xiang varieties in Hunan and Guangxi

Xiang Chinese is spoken in central and western Hunan and nearby parts of Guangxi and Guizhou by an estimated 37 million people.[19][10] Due to migrations, Xiang can be split into New and Old Xiang groups, with Old Xiang having fewer Mandarin-influenced features.[57][9] Xiang varieties have universally lost their checked codas, but the majority of them still have a unique preserved checked tone contour. Most also have a three-way plosive distinction, like Wu varieties.[19]

One way of dividing Xiang varieties sees five distinct families, namely Changyi, Hengzhou, Louzhao, Chenxu, and Yongzhou.[58] Changshanese and one of Shuangfengnese or Loudinese are usually taken as Xiang representatives.[25]

Internal classification

Phylogenetic studies apply the linguistic comparative method to the database of comparative linguistic data developed by Laurent Sagart in 2019 to identify sound correspondences and establish cognates, and then use phylogenetic methods to infer relationships among these languages and estimate the age of their origin and homeland.[59]

The traditional, dialectological classification of Chinese languages is based on the evolution of the sound categories of Middle Chinese. Little comparative work has been done (the usual way of reconstructing the relationships between languages), and little is known about mutual intelligibility. Even within the dialectological classification, details are disputed, such as the establishment in the 1980s of three new top-level groups: Huizhou, Jin and Pinghua, despite the fact that Pinghua is itself a pair of languages and Huizhou may be half a dozen.[60][61]

Like Bai, the Min languages are commonly thought to have split off directly from Old Chinese.[62] The evidence for this split is that all Sinitic languages apart from the Min group can be fit into the structure of the Qieyun, a 7th-century rime dictionary.[63] However, this view is not universally accepted.

Points of contention

Like many other language families, Sinitic languages have had problems of classification. The following are a few examples.

Southern China

Traditionally, the lect of urban Hangzhou and New Xiang of eastern Hunan are not considered Mandarin.[19] However, linguists such as Richard VanNess Simmons and Zhou Zhenhe have observed that these two varieties possess more qualifying features of Mandarin languages.[40][64] For instance, the vowels of the second division of the jia () initial are often raised and backed in Wu and Xiang, whereas they are not in Hangzhounese and New Xiang.

Traditionally Mandarin Traditionally Wu Traditionally Xiang Gloss
Beijing Nanjing Nantong Shanghai Suzhou Wenzhou Hangzhou Changsha Shuangfeng
xua xuɑ xuo ho ho kʰo hua fa xo 'flower'
kua kuɑ kuo ko ko ko kua kua ko 'melon'
ɕia ɕiɑ xo ɦo ɦo ɦo ia xa ɣo 'down'

Nantongnese has heavy Wu influence, which has led to it also having raised and backed vowels.

Danzhounese and Maihua are both traditionally considered Yue lects.[19] Recent research, however, has noted that both are more likely unclassified.[65] Maihua, for example, may be a Yue-Hakka-Hainanese Min mixed language.[66]

Dongjiang Bendihua (東江本地話) is spoken in and around Huizhou and Heyuan. Its classification has always been unclear, though the most common standpoint is that it is considered Hakka.[19][67]

Northern China

The variety spoken in the Ganyu District of Lianyungang (贛榆話) is listed as a variety of Central Plains Mandarin in the Language Atlas of China,[19] though its tonal distribution is more similar to Peninsular Mandarin varieties.[68]

Relationships between groups

Jerry Norman classified the traditional seven dialect groups into three larger groups: Northern (Mandarin), Central (Wu, Gan, and Xiang) and Southern (Hakka, Yue, and Min). He argued that the Southern Group is derived from a standard used in the Yangtze valley during the Han dynasty (206 BC – 220 AD), which he called Old Southern Chinese, while the Central group was transitional between the Northern and Southern groups.[69] Some dialect boundaries, such as between Wu and Min, are particularly abrupt, while others, such as between Mandarin and Xiang or between Min and Hakka, are much less clearly defined.[12]

Scholars account for the transitional nature of the central varieties in terms of wave models. Iwata argues that innovations have been transmitted from the north across the Huai River to the Lower Yangtze Mandarin area and from there southeast to the Wu area and westwards along the Yangtze River valley and thence to southwestern areas, leaving the hills of the southeast largely untouched.[70]

A quantitative study

A 2007 study compared fifteen major urban dialects on the objective criteria of lexical similarity and regularity of sound correspondences, and subjective criteria of intelligibility and similarity. Most of these criteria show a top-level split with Northern, New Xiang, and Gan in one group and Min (samples at Fuzhou, Xiamen, Chaozhou), Hakka, and Yue in the other group. The exception was phonological regularity, where the one Gan dialect (Nanchang Gan) was in the Southern group and very close to Meixian Hakka, and the deepest phonological difference was between Wenzhounese (the southernmost Wu dialect) and all other dialects.[71]

The study did not find clear splits within the Northern and Central areas:[71]

The two Wu dialects (Wenzhou and Suzhou) occupied an intermediate position, closer to the Northern/New Xiang/Gan group in lexical similarity and strongly closer in subjective intelligibility but closer to Min/Hakka/Yue in phonological regularity and subjective similarity, except that Wenzhou was farthest from all other dialects in phonological regularity. The two Wu dialects were close to each other in lexical similarity and subjective similarity but not in mutual intelligibility, where Suzhou was actually closer to Northern/Xiang/Gan than to Wenzhou.[71]

In the Southern subgroup, Hakka and Yue grouped closely together on the three lexical and subjective measures but not in phonological regularity. The Min dialects showed high divergence: Fuzhou (Eastern Min) grouped only weakly with the Southern Min dialects of Xiamen and Chaozhou on the two objective criteria and was actually slightly closer to Hakka and Yue on the subjective criteria.[71]

Internal comparison

The following section compares the non-Bai and non-Cai–Long Sinitic languages. Though all stem from Old Chinese, they have developed differences from one another.

Writing system

POJ inscription
An example of Hokkien written exclusively in the Latin alphabet.

Typographically, the vast majority of Sinitic languages use Sinographs. However, some varieties, such as Dungan and Hokkien, have alternative scripts, namely Cyrillic and Latin alphabets. Even between varieties which use Sinographs, characters are repurposed or invented to cover for the difference in vocabulary. Examples include ; 'pretty' in Yue,[72] 𠊎; 'I', 'me' in Hakka,[46] ; 'this' in Hokkien,[73] ; 'to not want' in Wu,[48] ; 'do not' in Xiang, and ; 'ill-tempered' in Mandarin.[74][24] Note that both traditional and simplified characters can be used to write any lect.


Phonologically speaking, though all Sinitic languages possess tones, their contours and the total number of tones vary widely, from Shanghainese, which can be analysed as having only two tones,[48] to Bobainese, which has ten.[75] Sinitic languages also vary widely in their phonological inventories and phonotactics. Take for instance /mɭɤŋ/ (門兒; 'door (diminutive)') seen in Pingdingnese,[20] or /tʃɦɻʷəi/ (水; 'water') of Xuanzhounese,[76] both of which show syllables that do not follow the (single) consonant-glide-vowel-consonant syllable structure of better-known lects. Tone sandhi is also a feature which not all lects share. Cantonese, for instance, has only a very weak system,[77] whereas Wu varieties not only have complex, intricate systems that affect almost all syllables, but also use them to mark grammatical part of speech.[48][49] Take for instance this simplified analysis of Suzhounese tone sandhi:[78]

Unchecked tone sandhi (contour of the whole chain, by chain length)
1st char tone cat      2 char    3 char    4 char
dark level (1)         4-0       4-4-0     4-4-4-0
light level (2)        2-3       2-3-0     2-3-4-0
rising (3)             5-1       5-1-0     5-1-1-0
dark departing (5)     52-3      52-3-0    52-3-4-0
light departing (6)    23-1      23-1-0    23-1-1-0

Checked tone sandhi (contour of the whole chain, by chain length)
2nd char tone cat    1st char     2 char    3 char    4 char
level (1, 2)         dark (7)     4-23      4-23-0    4-23-4-0
level (1, 2)         light (8)    2-3       2-3-0     2-3-4-0
rising (3)           dark (7)     5-51      5-51-0    5-51-1-0
rising (3)           light (8)    2-51      2-51-0    2-51-1-0
departing (5, 6)     dark (7)     5-523     5-52-3    5-52-2-3
departing (5, 6)     light (8)    2-523     2-52-3    2-52-2-3
checked (7, 8)       dark (7)     4-4       4-4-0     4-4-4-2
checked (7, 8)       light (8)    3-4       3-4-0     3-4-2-0
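In the unchecked half of the table, the contour of an entire chain depends only on the tone category of its first character and on the chain length, so it can be modelled as a simple lookup. The Python sketch below encodes just that half; `UNCHECKED_SANDHI` and `sandhi_contour` are illustrative names, the values are copied from the table, and no interpretation of the individual values (such as 0) is assumed.

```python
# Unchecked chains: first-character tone category -> chain length -> contour
# of each syllable, copied from the simplified table above.
UNCHECKED_SANDHI = {
    "dark level": {2: ("4", "0"), 3: ("4", "4", "0"), 4: ("4", "4", "4", "0")},
    "light level": {2: ("2", "3"), 3: ("2", "3", "0"), 4: ("2", "3", "4", "0")},
    "rising": {2: ("5", "1"), 3: ("5", "1", "0"), 4: ("5", "1", "1", "0")},
    "dark departing": {2: ("52", "3"), 3: ("52", "3", "0"), 4: ("52", "3", "4", "0")},
    "light departing": {2: ("23", "1"), 3: ("23", "1", "0"), 4: ("23", "1", "1", "0")},
}


def sandhi_contour(first_tone, length):
    """Return the per-syllable contours for an unchecked chain.

    first_tone: tone category name of the first character.
    length: number of characters in the chain (2, 3 or 4 in the table).
    """
    if first_tone not in UNCHECKED_SANDHI:
        raise KeyError(f"no unchecked entry for tone category {first_tone!r}")
    by_length = UNCHECKED_SANDHI[first_tone]
    if length not in by_length:
        raise ValueError("the table only covers chains of 2 to 4 characters")
    return by_length[length]


if __name__ == "__main__":
    for tone in UNCHECKED_SANDHI:
        for n in (2, 3, 4):
            print(f"{tone:<16} {n} char: {'-'.join(sandhi_contour(tone, n))}")
```

The checked half of the table would need a second key, since there the contour also depends on the tone category of the second character.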


Apart from phonology, grammar is the area in which Sinitic languages differ the most. The majority of Sinitic languages do not possess tenses; exceptions include Northern Wu lects such as Shanghainese and Suzhounese, though the tense system is largely breaking down in Shanghainese due to Mandarin influence.[49][79] Sinitic languages generally also have no case marking, though lects such as Linxianese and Hengshannese do possess case particles, with the latter expressing case through tone change.[80][81] Sinitic languages generally have SVO word order and possess classifiers.

Verb usage may be different between Sinitic languages. Notice the double verb marking seen in lects such as Beijingese, in these sentences meaning "today I go to Guangzhou":[82]


今天 我 到 廣州 去
Jīntiān wǒ dào Guǎngzhōu qù
today 1sg arrive Guangzhou go

今阿 我 廣州 去
cin1-a1 ngeu4 kuaon3-cieu1 chi5
today 1sg Guangzhou go

Indirect object marking

Sinitic languages vary greatly in how they mark indirect objects. The main point of variation is the relative placement of the indirect and direct objects.[9][20]

Mandarinic, Xiang, Hui and Min languages often place the indirect object (IO) before the direct object (DO). Some lects have switched to IO-DO structure due to Mandarin influence, such as Nanchangese and Shanghainese, though Shanghainese also has the alternative word order.

On the other hand, Gan, Wu, Hakka, and Yue languages tend to place the DO in front of the IO.


Like other East Asian languages such as Japanese and Korean, Sinitic languages have a system of classifiers. However, the use of classifiers varies greatly in features such as definiteness.[20] In Cantonese, for instance, they can be used to mark possession, which is rare in Sinitic but common in Southeast Asia.[9]

我 本 書
ngo5 bun2 syu1
1SG CL book
'my book'

and are the most common generic classifiers cross-linguistically.[9] As previously mentioned, Mandarinic languages tend to have fewer classifiers whereas the Southern non-Mandarinic varieties tend to have more.[20]


Sinitic languages can vary greatly in their systems of demonstratives.[20] Standard Mandarin and other Northeastern varieties have a two-way system: 這; zhè (proximal) and 那; nà (distal), but this is not the only system found in Sinitic languages.

Wuhannese has a neutral demonstrative, which can be used regardless of the distance to the deictic center.[84][85] Similar systems are found in Northern Wu lects such as Suzhounese and Ningbonese.[49][20]

[c] 是 生 的 , [c] 是 熟 的
nɤ35 sɿ35 sən55 ti , nɤ35 sɿ35 səu213 ti
DEM COP unripe P , DEM COP ripe P

In the above sentence, /nɤ³⁵/ can be translated as either 'this' or 'that'. Although Wuhannese has this one-term neutral system, it also has a two-way proximal-distal system. The same is true of most other lects with a one-term system.

Even within two-way systems, which are the most common, terms may have developed to mean the opposite distance from the deictic center. Cantonese ; go² (distal) and Shanghainese ; geq (proximal) are both etymologically from , for instance.[72][48]

Many Sinitic languages have three-way systems, but the three distances are not always the same ones. For instance, whereas Guangshan Mandarin has a person-oriented proximal, medial, distal system, Xinyu Gan has a distance-oriented close, proximal, distal system. Gan especially has many varieties with a three-way system, sometimes even marked with tone and vowel length rather than just changing the term used.[20][86]

A small number of varieties possess even four- or five-term demonstrative systems. Take for instance the following:[20]

Dongxiang Zhangshu
Close ꜀ko kọ꜆
Proximal ꜁ko ko꜆
Distal ꜀e ꜃hɛ
Yonder ꜁e ꜃hɛ̣

These two lects use tone change and vowel length respectively to distinguish between the four demonstratives.


  1. ^ From Late Latin Sīnae, "the Chinese", probably from Arabic Ṣīn ('China'), from the Chinese dynastic name Qin. (OED). In 1982, Paul K. Benedict proposed a subgroup of Sino-Tibetan called "Sinitic" comprising Bai and Chinese.[1] The precise affiliation of Bai remains uncertain[2] and the term "Sinitic" is usually used as a synonym for Chinese, especially when viewed as a language family rather than as a language.[3]
  2. ^ See, for example, Enfield (2003:69) and Hannas (1997). The Chinese terms often translated as 'language' and 'dialect' do not correspond well to those translations. These are 語言; yǔyán, corresponding to macrolanguage or language cluster, which is used for Chinese itself; 方言; fāngyán, which separates mutually unintelligible languages within a yǔyán; and 土語; tǔyǔ or 土話; tǔhuà, which corresponds better to the familiar Western linguistic use of 'dialect'.[8]
  3. ^ a b This term was not assigned a character.



  1. ^ Wang (2005), p. 107.
  2. ^ Wang (2005), p. 122.
  3. ^ Mair (1991), p. 3.
  4. ^ van Driem (2001), p. 351.
  5. ^ Zhang, Menghan; Yan, Shi; Pan, Wuyun; Jin, Li (2019). "Phylogenetic evidence for Sino-Tibetan origin in northern China in the Late Neolithic". Nature. 569 (7754): 112–115. Bibcode:2019Natur.569..112Z. doi:10.1038/s41586-019-1153-z. ISSN 1476-4687. PMID 31019300. S2CID 129946000.
  6. ^ Sagart et al. (2019).
  7. ^ van Driem (2001:403) states "Bái ... may form a constituent of Sinitic, albeit one heavily influenced by Lolo–Burmese."
  8. ^ Bradley (2012), p. 1.
  9. ^ a b c d e f g h i j k l m Chan, Sin-Wai; Chappell, Hilary; Li, Lan (2017). Routledge Encyclopedia of the Chinese language: Mandarin and other Sinitic languages. Oxford: Routledge. pp. 605–628.
  10. ^ a b c d e f g h i "Chinese".
  11. ^ Norman (2003), p. 72.
  12. ^ a b Norman (1988), pp. 189–190.
  13. ^ Zhengzhang, Shangfang (2010). "蔡家话白语关系及词根比较". 研究之乐 (2). Shanghai: Shanghai Educational Publishing House: 389–400.
  14. ^ 貴州省民族識別工作隊語言組 (1984). 蔡家的語言.
  15. ^ 貴州省民族識別工作隊 (1984). 南龍人(南京-龍家)族別問題調查報告.
  16. ^ Gong, Xun (6 November 2015). "How Old is the Chinese in Bái?". Paris.
  17. ^ 貴州省志 民族志. Guiyang: 貴州民族出版社. 2002.
  18. ^ Xu, Lin; Zhao, Yansun (1984). 白语简志. 民族印刷廠.
  19. ^ a b c d e f g h i j k l m n o p q r s t u v w x y z aa ab ac ad ae Li, Rong (2012). 中國語言地圖集.
  20. ^ a b c d e f g h i j k Chappell, Hilary M. (2015). Diversity in Sinitic Languages. Oxford University Press. ISBN 9780198723790.
  21. ^ Tsung, Linda (2014). Language Power and Hierarchy: Multilingual Education in China. Bloomsbury Publishing.
  22. ^ a b Lin, Tao (1987). "北京官话区的划分". 方言 (3): 166–172. ISSN 0257-0203.
  23. ^ a b Zhang, Shifang (2010). 北京官话语音研究. Beijing Language and Culture University Press. ISBN 978-7-5619-2775-5.
  24. ^ a b Yin, Shichao (1997). 哈爾濱方言詞典. 江蘇教育出版社.
  25. ^ a b c d e f g h 北京大學中國語言文學系 (1995). 漢語方言詞彙. 语文出版社.
  26. ^ Wu, Jizhang; Tang, Jianxiong; Chen, Shujing (2005). 河北省志 方言志. 方志出版社.
  27. ^ Qian, Zengyi (2002). "山東方言研究" (3).
  28. ^ a b Qian, Zengyi (2010). 漢語官話方言研究. 齊魯書社.
  29. ^ Luo, Futeng (1997). 牟平方言詞典. 江蘇教育出版社.
  30. ^ He, Wei (June 1993). 洛陽方言研究. 社會科學文獻出版社.
  31. ^ Su, Xiaoqing; Lü, Yongwei (December 1996). 徐州方言詞典. 江蘇教育出版社. ISBN 7534328837.
  32. ^ Zhang, Chengcai (December 1994). 西寧方言詞典. 江蘇教育出版社. ISBN 7534322936.
  33. ^ He, Wei. 中原官話分區. Beijing: 中國社會科學院語言研究所.
  34. ^ a b Hou, Jing (2002). 現代漢語方言概論. 上海教育出版社. p. 46.
  35. ^ Wang, Futang (1998). 漢語方言語音的演變和層次. Beijing: 語文研究.
  36. ^ Zhou, Jixu (2012). "南路話和湖廣話的語音特點". 語言研究 (3).
  37. ^ 漢語方言學大詞典. 廣東教育出版社. 2017. p. 150. ISBN 9787554816332.
  38. ^ Zeng, Xiaoyu (2014). "《西儒耳目資》音系基礎非南京方言補證". 語言科學 (4).
  39. ^ Tao, Guoliang. 南通方言詞典. Nanjing: 江蘇人民出版社.
  40. ^ a b c d Richard VanNess Simmons (1999). Chinese Dialect Classification: A comparative approach to Harngjou, Old Jintarn, and Common Northern Wu. John Benjamins Publishing Co.
  41. ^ a b Lin, Yi (2016). "廣西的粵方言". 欽州學院學報. 31 (6): 38–42.
  42. ^ "The Hakka People > Historical Background". Archived from the original on 2019-09-09. Retrieved 2010-06-11.
  43. ^ a b Peng, Xinyi (2010). 江西客贛語的特殊音韻現象與結構變遷. 國立中興大學中國文學研究所.
  44. ^ a b Lu, Guoyao (2003). 魯國堯語言學論文集·客、贛、通泰方言源於南朝通語說. 江蘇教育出版社. pp. 123–135. ISBN 7534354994.
  45. ^ Sagart, Lawrence (March 2011). Chinese dialects classified on shared innovations.
  46. ^ a b Huang, Xuezhen (December 1995). 梅縣方言詞典. 江蘇教育出版社. ISBN 7534325064.
  47. ^ a b c Qian, Nairong (1992). 當代吳語研究. 上海教育出版社.
  48. ^ a b c d e Qian, Nairong (2007). 上海話大詞典. 上海教育出版社.
  49. ^ a b c d Ye, Changling (1993). 蘇州方言詞典. 江蘇教育出版社.
  50. ^ "奉贤金汇学校首开"偒傣话"课(图)". 人民網. Archived from the original on 2022-07-22. Retrieved 2022-07-22.
  51. ^ Li, Rulong (2001). 漢語方言學. Beijing: 高等教育出版社. p. 17.
  52. ^ Zhengzhang, Shangfang (1986). "皖南方言的分區(稿)". 方言 (1).
  53. ^ Zhang, Guangyu (1999). "東南方言關係總論". 方言 (1).
  54. ^ Meng, Qinghui (2005). 徽州方言. Beijing: 安徽人民出版社.
  55. ^ Sun, Yizhi; Chen, Changyi; Xu, Yangchun (2001). 江西贛方言語音的特點.
  56. ^ Chen, Zhangtai. 閩語研究.
  57. ^ Song, Diwu; Cao, Shuji. 中國移民史 第五卷:名師其.
  58. ^ Bao, Houxing; Chen, Hui (2005). 湘語的分區(稿).
  59. ^ Sagart et al. (2019), pp. 10319–10320.
  60. ^ Kurpaska (2010), pp. 41–53, 55–56.
  61. ^ Yan (2006), pp. 9–18, 61–69, 222.
  62. ^ Mei (1970), p. ?.
  63. ^ Pulleyblank (1984), p. 3.
  64. ^ Zhou, Zhenhe; You, Rujie (1986). Fāngyán yǔ zhōngguó wénhuà 方言与中国文化 [Dialects and Chinese culture]. Shanghai Renmin Chubanshe.
  65. ^ Kurpaska (2010), p. 73.
  66. ^ Jiang, Ouyang & Zou (2007)
  67. ^ Liu, Ruoyun (1991). 惠州方言志.
  68. ^ Liu, Chuanxian (2001). 赣榆方言志. Beijing: 中华书局.
  69. ^ Norman (1988), pp. 182–183.
  70. ^ Iwata (2010), pp. 102–108.
  71. ^ a b c d Tang & Van Heuven (2007), p. 1025.
  72. ^ a b Bai, Wanru (1998). 廣州方言詞典. 江蘇教育出版社出版. ISBN 9787534334344.
  73. ^ Li, Rong (1993). 廈門方言詞典. 江蘇教育出版社出版. ISBN 9787534319952.
  74. ^ Bao, Houxing (December 1998). 長沙方言詞典. 江蘇教育出版社出版. ISBN 9787534319983.
  75. ^ Xie, Jianyou (2007). 廣西漢語方言研究. 廣西人民出版社.
  76. ^ Shen, Ming (2016). 安徽宣城(雁翅)方言. 中國社會科學出版社.
  77. ^ Zheng, Ding'ou (1997). 香港粵語詞典. 江蘇教育出版社. ISBN 9787534329425.
  78. ^ Wang, Ping (August 1996). 蘇州方言語音研究. 華中理工大學出版社. ISBN 7560911315.
  79. ^ Qian, Nairong (錢乃榮) (2010). 《從〈滬語便商〉所見的老上海話時態》 (Tenses and Aspects? Old Shanghainese as Found in the Book Huyu Bian Shang). Shanghai: The Chinese University of Hong Kong Press.
  80. ^ Zhang, Qiang (2021). "臨夏方言格標記「哈[XA⁴³]」探究". 淮南師範學院學報. 23 (2). Guangzhou.
  81. ^ Liu, Juan; Peng, Zerun (July 2019). "衡山方言人稱代詞領格變調現象的實質". 湘潭大學學報(哲學社會科學版). 43 (4).
  82. ^ Liu, Danqing (2001). 吳語的句法類型特點.
  83. ^ Lau, Chun-Fat (November 2021). 香港客家話研究. Hong Kong: 中華教育. ISBN 9789888760046.
  84. ^ Zhu, Jiansong (1992). 武漢方言研究.
  85. ^ Zhu, Jiansong (May 1995). 武漢方言詞典. 江蘇教育出版社. ISBN 7534323290.
  86. ^ Wei, Gangqiang (1995). 黎川方言詞典. 江蘇教育出版社.

Works cited