Tamil phonology is characterised by the presence of “true-subapical” retroflex consonants and multiple rhotic consonants. Its script does not distinguish between voiced and unvoiced consonants; phonetically, voicing is determined by a consonant's position in a word: stops are voiced intervocalically and after nasals, except when geminated. Tamil phonology permits few consonant clusters, and none of them can occur word-initially.
The vowels are called உயிரெழுத்து uyireḻuttu ('life letter'). They are classified into short and long vowels (five of each type) and two diphthongs.
The long (nedil) vowels are about twice as long as the short (kuṟil) vowels. The diphthongs are usually pronounced about 1.5 times as long as the short vowels, though most grammatical texts place them with the long vowels.
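The relative durations described above can be illustrated with a minimal Python sketch. The lookup table and the function name `relative_duration` are hypothetical, and the numbers are only the approximate ratios given in the text, not measured values.

```python
# Approximate durations in units of one short vowel, as described in the text.
# All names here are illustrative assumptions, not part of any library.
SHORT_VOWELS = ["அ", "இ", "உ", "எ", "ஒ"]   # kuṟil
LONG_VOWELS = ["ஆ", "ஈ", "ஊ", "ஏ", "ஓ"]    # nedil
DIPHTHONGS = ["ஐ", "ஔ"]

RELATIVE_DURATION = {}
RELATIVE_DURATION.update({v: 1.0 for v in SHORT_VOWELS})
RELATIVE_DURATION.update({v: 2.0 for v in LONG_VOWELS})   # about twice a short vowel
RELATIVE_DURATION.update({v: 1.5 for v in DIPHTHONGS})    # about 1.5 times a short vowel


def relative_duration(vowel: str) -> float:
    """Return the approximate duration of a vowel letter in short-vowel units."""
    try:
        return RELATIVE_DURATION[vowel]
    except KeyError:
        raise ValueError(f"not a Tamil vowel letter: {vowel!r}")
```

For instance, `relative_duration("ஆ")` is twice `relative_duration("அ")`, mirroring the nedil/kuṟil contrast.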
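| ||Front (short)||Front (long)||Central (short)||Central (long)||Back (short)||Back (long)|
|Close||i இ||iː ஈ|| || ||u/ɯ உ||uː ஊ|
|Mid||e எ||eː ஏ|| || ||o ஒ||oː ஓ|
|Open|| || ||ä அ||äː ஆ|| || |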
Tamil has two diphthongs: /aɪ/ ஐ and /aʊ/ ஔ, the latter of which is restricted to a few lexical items. Some scholars, such as Krishnamurti, consider the diphthongs to be clusters of /a/ + /j, ʋ/, as they pattern with other VC sequences. The spelling of some words also varies; e.g. avvai may be written அவ்வை (avvai), ஔவை (auvai) or அவ்வய் (avvay), the first being the most common. Word-final /u/ is pronounced [ɯ~ɨ], and it has been unrounded for so long that speakers unround it even in Literary Tamil (LT); in Spoken Tamil (ST) the unrounded vowel can also occur medially in some words, after the first syllable. Word-final [u] occurs in some names, chiefly male nicknames such as rāju for rājēndran̠.
Colloquially, an initial /i(ː)/ or /e(ː)/ may have a [ʲ] onglide; likewise, an initial /o(ː)/ or /u(ː)/ may have a [ʷ] onglide, e.g. [ʲeɾi] and [ʷoɾɯ]. This does not occur in Sri Lankan dialects.
Colloquial Tamil also has nasalized vowels, formed from a word-final vowel + nasal sequence (except for /Vɳ/, where an epenthetic u is added after it instead). A long vowel + nasal simply nasalizes the vowel, whereas a short vowel + nasal may also change the vowel quality. For example, /an/ is fronted to [ɛ̃], so அவன் /aʋan/ becomes [aʋɛ̃] ([aʋæ̃] for some speakers), and /am/ is rounded to [õ], so மரம் /maɾam/ becomes [maɾõ]. The remaining vowels are only nasalized: நீங்களும் /n̪iːnkaɭum/ becomes [n̪iːŋɡaɭũ], and வந்தான் /ʋant̪aːn/ becomes [ʋan̪d̪ãː].
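The final-nasal rules described above can be sketched as a small string rewrite over IPA transcriptions. This is a simplified illustration under stated assumptions: the function name `nasalize_final` is hypothetical, only plain /m/ and /n/ are treated as triggering nasals, voicing changes are ignored, and the [æ̃] variant is not modelled.

```python
TILDE = "\u0303"    # combining tilde, marks nasalization
LENGTH = "\u02d0"   # IPA length mark
VOWELS = set("aeiou")
NASALS = {"m", "n"}


def nasalize_final(word: str) -> str:
    """Apply the colloquial word-final vowel + nasal rules to an IPA string."""
    # /Vɳ/ is the exception: an epenthetic u is added instead.
    if word.endswith("\u0273"):
        return word + "u"
    if len(word) < 2 or word[-1] not in NASALS:
        return word
    stem, nasal = word[:-1], word[-1]
    # Long vowel + nasal: the vowel is only nasalized, length is kept.
    if stem.endswith(LENGTH):
        if len(stem) >= 2 and stem[-2] in VOWELS:
            return stem[:-1] + TILDE + LENGTH
        return word
    vowel = stem[-1]
    if vowel not in VOWELS:
        return word
    # Short /a/ also changes quality: fronted before /n/, rounded before /m/.
    if vowel == "a":
        vowel = "\u025b" if nasal == "n" else "o"
    return stem[:-1] + vowel + TILDE
```

Applied to /aʋan/ and /maɾam/, the sketch yields the fronted and rounded nasal vowels given in the text; other short vowels simply gain the tilde.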
In spoken Tamil, an epenthetic vowel u is sometimes added to words ending in consonants, e.g. nil > nillu, āḷ > āḷu, nāḷ > nāḷu (nā in some dialects), vayal > vayalu. This epenthetic vowel is deleted when another word is joined to the end.
Colloquially, the short high vowels /i/ and /u/ are lowered to [e] and [o] when followed by a short consonant and /a/ or /aɪ/. For example, இடம் /iʈam/ becomes [eɖam], and உடம்பு /uʈampu/ becomes [oɖambɯ]. This does not happen in pronouns and some other words; e.g. இவன் ivaṉ and எவன் evaṉ remain different words. /aɪ/ also monophthongises to /e/, but it still causes the lowering of a preceding /i, u/, e.g. ilai > ele. Additionally, the long front vowels /eː/ and /iː/ are subject to retraction when they occur in the first syllable of a bisyllabic word and are followed by a retroflex consonant. As such, /ʋiːʈu/ "house" becomes [ʋɨːɖɪ̈], but its inflected form /ʋiːʈʈukku/ remains [ʋiːʈ(ː)uk(ː)ɪ̈]. Likewise, /t̪eːʈu/ "search!" becomes [t̪əːɖɪ̈], but /t̪eːʈinaːn/ "(he) searched" remains [t̪eːɖinãː]. The presence and degree of retraction may differ for each vowel, and vary between dialects and even individual speakers. Almost all words end in vowels in ST.
For some speakers of ST, the front vowels /i(ː), e(ː)/ are rounded to their corresponding back rounded vowels when they follow a labial consonant /m, p, ʋ/ and precede a retroflex consonant. Some words with this change are quite acceptable, such as பெண் /peɳ/ > பொண்/பொண்ணு [poɳ~poɳːɯ], but others, such as வீடு /ʋiːʈu/ > வூடு [ʋuːɖɯ], are less accepted and may even be considered vulgar.
Other changes in ST include vowel harmony, in which vowels change their height to become more similar to nearby vowels, as in LT /koʈu/ > ST [kuɖɯ].
The consonants are known as மெய்யெழுத்து meyyeḻuttu ('body letters'). They are classified into three categories of six each: vallinam ('hard'), mellinam ('soft' or nasal), and idayinam ('medium'). Tamil has very restricted consonant clusters (for example, there are no word-initial clusters) and has allophonic aspirated stops. There are well-defined rules for the voicing of stops in Centamil, the written form of Tamil (from the period of Tamil history before Sanskrit words were borrowed). Stops are voiceless at the start of a word, in a consonant cluster with another stop, and when geminated; they are voiced otherwise.
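The positional voicing rule stated above can be illustrated with a short Python sketch over a list of phoneme tokens. It is a simplification under stated assumptions: the function name `voice_stops` and the token representation are hypothetical, and only the four stops /p, t̪, ʈ, k/ are handled, since other consonants of the hard class may behave differently in medial position.

```python
DENTAL = "\u032a"  # combining bridge below, marks dental articulation

# Voiceless stop -> voiced counterpart (p, t̪, ʈ, k -> b, d̪, ɖ, ɡ).
VOICED = {
    "p": "b",
    "t" + DENTAL: "d" + DENTAL,
    "\u0288": "\u0256",
    "k": "\u0261",
}


def voice_stops(phonemes):
    """Return surface tokens with stops voiced according to their position."""
    out = []
    for i, ph in enumerate(phonemes):
        if ph not in VOICED:
            out.append(ph)
            continue
        prev = phonemes[i - 1] if i > 0 else None
        nxt = phonemes[i + 1] if i + 1 < len(phonemes) else None
        word_initial = prev is None
        # Adjacent to another stop: covers both geminates and stop clusters.
        beside_stop = prev in VOICED or nxt in VOICED
        out.append(ph if word_initial or beside_stop else VOICED[ph])
    return out
```

With this sketch, the tokens of /uʈampu/ come out with voiced [ɖ] and [b], while a geminate such as the /kk/ in pakkam stays voiceless.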
Tamil is characterized by its use of more than one type of coronal consonant: like many of the other languages of India, it contains a series of retroflex consonants. Notably, the Tamil retroflex series includes the retroflex approximant /ɻ/ (ழ) (as in the word Tamil itself; often transcribed 'zh'), which is rare in the Indo-Aryan languages. Among the other Dravidian languages, the retroflex approximant also occurs in Malayalam (for example in 'Kozhikode') and Badaga; it disappeared from spoken Kannada around 1000 AD (although the character is still written, and exists in Unicode, ೞ as in ಕೊೞೆ), and was never present in Telugu. In some dialects of colloquial Tamil, this consonant is disappearing, shifting to the retroflex lateral approximant /ɭ/ in the south and the palatal approximant /j/ in the north.
The proto-Dravidian alveolar stop *ṯ developed into an alveolar trill /r/ in the Southern Dravidian languages while *ṯṯ and *ṉṯ remained (modern ṯṟ, ṉṟ).
[n] and [n̪] are in complementary distribution and are predictable: [n̪] occurs word-initially and before /t̪/, and [n] occurs elsewhere; that is, they are allophonic.
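The complementary distribution just described can be written as a small Python sketch. The function name `realize_n` and the token-list input are assumptions made for illustration; the token "n" stands for the single underlying coronal nasal.

```python
DENTAL = "\u032a"  # combining bridge below, marks dental articulation


def realize_n(phonemes):
    """Pick the dental or alveolar allophone for each underlying "n" token."""
    out = []
    for i, ph in enumerate(phonemes):
        if ph == "n":
            word_initial = i == 0
            before_dental_stop = (
                i + 1 < len(phonemes) and phonemes[i + 1] == "t" + DENTAL
            )
            # Dental word-initially and before the dental stop, alveolar elsewhere.
            if word_initial or before_dental_stop:
                ph = "n" + DENTAL
        out.append(ph)
    return out
```

Because the allophone is fully determined by position, the function needs no lexical information beyond the token sequence itself.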
/ɲ/ is extremely rare word initially and is only found before /t͡ɕ/ word medially. [ŋ] only occurs before /k/.
A chart of the Tamil consonant phonemes in the International Phonetic Alphabet follows:
| ||Labial||Dental||Alveolar||Retroflex||Palatal||Velar||Glottal|
|Nasal||m ம்||(n̪) ந்||n ன்||ɳ ண்||ɲ ஞ்||(ŋ) ங்|| |
|Stop/Affricate||p ப்||t̪ த்||tːr ற்ற||ʈ ட்||t͡ɕ ~ t͡ʃ ச்||k க்|| |
|Fricative||(f)|| ||s ஸ் (z)||(ʂ) ஷ்||(ɕ) ஶ்||(x)||(h) ஹ்|
|Approximant||ʋ வ்|| || ||ɻ ழ்||j ய்|| || |
|Lateral approximant|| || ||l ல்||ɭ ள்|| || || |
The voiceless stops are voiced in certain positions, namely intervocalically and after nasals.
In modern Tamil, however, voiced plosives occur initially in loanwords. Geminate stops get simplified to singleton unvoiced stops after long vowels, suggesting the primary cue is now voicing (cf. kūṭṭam-kūṭam becoming kūṭam-kūḍam in modern speakers). Altogether, we see a shift in progress towards phonemic voicing, more advanced in some dialects than others.
Classical Tamil had a phoneme called the āytam, written as ‘ஃ’. Tamil grammarians of the time classified it as a dependent phoneme (or restricted phoneme) (cārpeḻuttu), but it is very rare in modern Tamil. The rules of pronunciation given in the Tolkāppiyam, a text on the grammar of Classical Tamil, suggest that the āytam could have glottalised the sounds it was combined with. It has also been suggested that the āytam was used to represent the voiced implosive (or closing part or the first half) of geminated voiced plosives inside a word. In modern Tamil, the āytam is also used to convert p to f, and to represent a few other sounds, when writing English words in the Tamil script.
Unlike Indo-Aryan languages spoken around it, Tamil does not have distinct letters for aspirated consonants and they are found as allophones of the normal stops. The Tamil script also lacks distinct letters for voiced and unvoiced stops as their pronunciations depend on their location in a word. For example, the voiceless stop [p] occurs at the beginning of words while the voiced stop [b] cannot. In the middle of words, voiceless stops commonly occur as a geminated pair like -pp-, while voiced stops do not. Only voiced stops can appear medially and after a corresponding nasal. Thus both the voiced and voiceless stops can be represented by the same script in Tamil without ambiguity, the script denoting only the place and broad manner of articulation (stop, nasal, etc.). The Tolkāppiyam cites detailed rules as to when a letter is to be pronounced with voice and when it is to be pronounced unvoiced. The only exceptions to these rules are the letters ச and ற as they are pronounced medially as [s] and [r] respectively.
Some loan words are pronounced in Tamil as they were in the source language, even if this means that consonants which should be unvoiced according to the Tolkāppiyam are voiced.
Elision is the reduction in the duration of sound of a phoneme when preceded by or followed by certain other sounds. There are well-defined rules for elision in Tamil. They are categorised into different classes based on the phoneme which undergoes elision.
|1.||Kutr iyal ukaram (short nature U)||the vowel u|
|2.||Kutr iyal ikaram (short nature I)||the vowel i|
|3.||Aiykaara k kurukkam (AI shortening)||the diphthong ai|
|4.||Oukaara k kurukkam (AU shortening)||the diphthong au|
|5.||Aaytha k kurukkam (h shortening)||the special character akh (aaytham)|
|6.||Makara k kurukkam (M shortening)||the phoneme m|
1. Kutr iyal ukaram refers to the vowel /u/ turning into the close back unrounded vowel [ɯ] at the end of words (e.g.: ‘ஆறு’ (meaning ‘six’) will be pronounced [aːrɯ]).
2. Kutr iyal ikaram refers to the shortening of the vowel /i/ before the consonant /j/.
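The first rule above, the unrounding of word-final /u/, can be sketched in Python as a one-step rewrite on an IPA string. The function name `unround_final_u` is hypothetical, and the sketch ignores lexical exceptions such as names that keep a rounded final [u].

```python
LENGTH = "\u02d0"      # IPA length mark
UNROUNDED_U = "\u026f"  # close back unrounded vowel


def unround_final_u(word: str) -> str:
    """Replace a word-final short /u/ with the close back unrounded vowel."""
    # A long vowel ends in the length mark, so it is never matched here.
    if word.endswith("u"):
        return word[:-1] + UNROUNDED_U
    return word
```

A transcription ending in long /uː/ is left unchanged, since only the short final vowel is affected.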
The following text is Article 1 of the Universal Declaration of Human Rights.
All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.
மனிதப் பிறிவியினர் சகலரும் சுதந்திரமாகவே பிறக்கின்றனர்; அவர்கள் மதிப்பிலும், உரிமைகளிலும் சமமானவர்கள், அவர்கள் நியாயத்தையும் மனச்சாட்சியையும் இயற்பண்பாகப் பெற்றவர்கள். அவர்கள் ஒருவருடனொருவர் சகோதர உணர்வுப் பாங்கில் நடந்துகொள்ளல் வேண்டும்.
maṉitap piṟiviyiṉar cakalarum cutantiramākavē piṟakkiṉṟaṉar; avarkaḷ matippilum, urimaikaḷilum camamāṉavarkaḷ, avarkaḷ niyāyattaiyum maṉaccāṭciyaiyum iyaṟpaṇpākap peṟṟavarkaḷ. Avarkaḷ oruvaruṭaṉoruvar cakōtara uṇarvup pāṅkil naṭantukoḷḷal vēṇṭum.
/manit̪ap‿piriʋijinaɾ sakalaɾum sut̪ant̪iɾamaːkaʋeː pirakkinranaɾ ǀ aʋaɾkaɭ mat̪ippilum uɾimai̯kaɭilum samamaːnaʋaɾkaɭ aʋaɾkaɭ nijaːjat̪t̪ai̯jum manat͡ʃt͡ʃaːʈt͡ʃijum ijarpaɳpaːkap petːraʋaɾkaɭ ǁ aʋaɾkaɭ oɾuʋaɾuʈanoɾuʋaɾ sakoːt̪aɾa uɳaɾʋup‿paːnkil naʈant̪ukoɭɭal ʋeːɳʈum/