The Austroasiatic languages include Vietnamese and Khmer, as well as many other languages spoken in areas scattered as far afield as Malaya (Aslian) and central India (Korku), often in isolated pockets surrounded by the ranges of other language groups. Most linguists believe that Austroasiatic languages once ranged continuously across southeast Asia and that their scattered distribution today is the result of the subsequent arrival of other language groups.
One of these groups was the Tai–Kadai family, which includes Thai, Lao and Shan. These languages were originally spoken in southern China, where the greatest diversity within the family is still found, and possibly as far north as the Yangtze valley. As Chinese civilization expanded southward from the North China Plain, many Tai–Kadai speakers became sinicized, while others were displaced to Southeast Asia. With the exception of Zhuang, most of the Tai–Kadai languages still remaining in China are spoken in isolated upland areas.
The Miao–Yao or Hmong–Mien languages also originated in southern China, where they are now spoken only in isolated hill regions. Many Hmong–Mien speakers were displaced into Southeast Asia during the Qing Dynasty in the 18th and 19th centuries, a migration triggered by the suppression of a series of revolts in Guizhou.
The Austronesian languages are believed to have spread from Taiwan to the islands of the Indian and Pacific Oceans, as well as some areas of mainland southeast Asia.
To the north are the Turkic, Mongolic and Tungusic language families, which some linguists have grouped as an Altaic family, sometimes also including the Korean and Japonic languages. This grouping is now widely regarded as discredited and is no longer supported by most specialists in these languages. The languages tend to be atonal, polysyllabic and agglutinative, with subject–object–verb word order and some degree of vowel harmony. Critics of the Altaic hypothesis attribute these similarities to intense contact between the languages at some point in prehistory.
Chinese scholars often group Tai–Kadai and Hmong–Mien with Sino-Tibetan, but Western scholarship since the Second World War has considered them separate families. Some larger groupings have been proposed, but are not widely supported. The Austric hypothesis, based on morphology and other resemblances, is that Austroasiatic, Austronesian, often Tai–Kadai, and sometimes Hmong–Mien form a genetic family. Other hypothetical groupings include the Sino-Austronesian languages and Austro-Tai languages. Linguists undertaking long-range comparison have hypothesized even larger macrofamilies such as Dené–Caucasian, including Sino-Tibetan and Ket.
The Mainland Southeast Asia linguistic area stretches from Thailand to China and is home to speakers of languages of the Sino-Tibetan, Hmong–Mien (or Miao–Yao), Tai–Kadai, Austronesian (represented by Chamic) and Austroasiatic families. Neighbouring languages across these families, though presumed unrelated, often have similar typological features, which are believed to have spread by diffusion.
Characteristic of many MSEA languages is a particular syllable structure involving monosyllabic morphemes, lexical tone, a fairly large inventory of consonants, including phonemic aspiration, limited clusters at the beginning of a syllable, plentiful vowel contrasts and relatively few final consonants. Languages in the northern part of the area generally have fewer vowel and final contrasts but more initial contrasts.
A well-known feature is the similarity of the tone systems of Chinese, Hmong–Mien, Tai languages and Vietnamese. Most of these languages passed through an earlier stage with three tones on most syllables (apart from checked syllables ending in a stop consonant). This stage was followed by a tone split in which the distinction between voiced and voiceless consonants disappeared and, in compensation, the number of tones doubled.
These parallels led to confusion over the classification of these languages, until Haudricourt showed in 1954 that tone was not an invariant feature, by demonstrating that Vietnamese tones corresponded to certain final consonants in other languages of the Mon–Khmer family, and proposed that tone in the other languages had a similar origin.
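As a rough illustration of the tone split described above, the following Python sketch models a toy inventory in which each earlier tone divides into two according to the voicing of the initial consonant, after which the voicing contrast is lost. The tone labels A, B and C, the register suffixes 1 and 2, the devoicing table and the sample syllables are all assumptions made for illustration; they are not attested forms from any particular language, and checked syllables are left out.

```python
# Toy model of a tone split conditioned by initial voicing.
# All data here is hypothetical and purely illustrative.

# Assumed devoicing correspondences for initial consonants.
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s"}


def split_tone(initial: str, rime: str, tone: str) -> tuple[str, str, str]:
    """Apply the split to one syllable carrying an earlier tone A, B or C.

    Syllables with a voiced initial receive register 2, all others
    register 1; the voiced initial is then replaced by its voiceless
    counterpart, so the old consonant contrast survives only as tone.
    """
    register = "2" if initial in DEVOICE else "1"
    return DEVOICE.get(initial, initial), rime, tone + register


# Six syllables: three tones, each with a voiceless and a voiced initial.
before = [(i, "a", t) for t in "ABC" for i in ("p", "b")]
after = [split_tone(*syllable) for syllable in before]

tones_before = {tone for _, _, tone in before}
tones_after = {tone for _, _, tone in after}

print(sorted(tones_before))  # ['A', 'B', 'C']
print(sorted(tones_after))   # ['A1', 'A2', 'B1', 'B2', 'C1', 'C2']
```

In this toy inventory the six syllables remain distinct after the change: the contrast once carried by p versus b is now carried by register 1 versus register 2.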
MSEA languages tend to have monosyllabic morphemes, though there are exceptions. Most MSEA languages are very analytic, with no inflection and little derivational morphology. Grammatical relations are typically signalled by word order, particles and coverbs or adpositions. Modality is expressed using sentence-final particles. The usual word order in MSEA languages is subject–verb–object. Chinese and Karen are thought to have changed to this order from the subject–object–verb order retained by most other Sino-Tibetan languages.
The order of constituents within a noun phrase varies: noun–modifier order is usual in Tai languages, Vietnamese and Miao, while in Chinese varieties and Yao most modifiers are placed before the noun. Topic–comment organization is also common.
Languages of both eastern and southeast Asia typically have well-developed systems of numeral classifiers. The other areas of the world where numeral classifier systems are common in indigenous languages are the western parts of North and South America, so that numeral classifiers could even be seen as a pan-Pacific Rim areal feature. However, similar noun class systems are also found among most Sub-Saharan African languages.
Today, these words of Chinese origin may be written in the traditional Chinese characters (Chinese, Japanese, and Korean), simplified Chinese characters (Chinese, Japanese), a locally developed phonetic script (Korean hangul, Japanese kana), or a Latin alphabet (Vietnamese). The Chinese, Japanese, Korean and Vietnamese languages are collectively referred to as CJKV, or just CJK, since modern Vietnamese is no longer written with Chinese characters at all.
In a similar way to the use of Latin and ancient Greek roots in English, the morphemes of Classical Chinese have been used extensively in all these languages to coin compound words for new concepts. These coinages, written in shared Chinese characters, have then been borrowed freely between languages. They have even been accepted into Chinese, a language usually resistant to loanwords, because their foreign origin was hidden by their written form.
In topic–comment constructions, sentences are frequently structured with a topic as the first segment and a comment as the second. This way of marking previously mentioned vs. newly introduced information is an alternative to articles, which are not found in East Asian languages. The topic–comment sentence structure is a legacy of Classical Chinese influence on the grammar of modern East Asian languages. In Classical Chinese, the focus of the phrase (i.e. the topic) was often placed first, followed by a statement about the topic. The most generic sentence form in Classical Chinese is "A B 也", where B is a comment about the topic A.
Linguistic systems of politeness, including frequent use of honorific titles, with varying levels of politeness or respect, are well-developed in Japanese and Korean. Politeness systems in Chinese are relatively weak: an earlier, more developed system has been simplified and plays a much smaller role in modern Chinese. This is especially true of the southern Chinese varieties. However, Vietnamese has retained a highly complex system of pronouns, in which the terms mostly derive from Chinese. For example, bác, chú, dượng, and cậu are all terms ultimately derived from Chinese and all refer to different statuses of "uncle".
In many of the region's languages, including Japanese, Korean, Thai, and Malay/Indonesian, new personal pronouns or forms of reference or address can and often do evolve from nouns as fresh ways of expressing respect or social status. Thus personal pronouns are open class words rather than closed class words: they are not stable over time, not few in number, and not clitics whose use is obligatory in grammatical constructs. In addition to Korean honorifics that indicate politeness toward the subject of the speech, Korean speech levels indicate a level of politeness and familiarity directed toward the audience.
With modernization and other trends, politeness language is becoming simpler. Avoiding the need for complex polite language can also motivate the use, in some situations, of languages such as Indonesian or English that have less complex respect systems.
"While 'Altaic' is repeated in encyclopedias and handbooks most specialists in these languages no longer believe that the three traditional supposed Altaic groups, Turkic, Mongolian and Tungusic, are related." Lyle Campbell & Mauricio J. Mixco, A Glossary of Historical Linguistics (2007, University of Utah Press), p. 7.