The word Santali in Ol Chiki script
Native to: India, Bangladesh, Nepal
Native speakers: 7.6 million (2011 census[1])[2]
  • Munda
    • East
      • Kherwarian
        • Santal
          • Santali
  • Mahali (Mahili)
  • Kamari-Santali
  • Khole
  • Lohari-Santali
  • Manjhi
  • Paharia
Official status
Official language in
Language codes
ISO 639-2: sat
ISO 639-3: either sat (Santali) or mjx (Mahali)
Glottolog: sant1410 (Santali), maha1291 (Mahali)
States where Santali is additional official language — Jharkhand and West Bengal
A girl speaking Santali.
Santali books in Mayurbhanj Book Fair

Santali (pronounced [santaɽi]; Ol Chiki: ᱥᱟᱱᱛᱟᱲᱤ, Bengali: সাঁওতালী, Odia: ସାନ୍ତାଳୀ, Devanagari: संताली), also known as Santal or Santhali, is the most widely spoken language of the Munda subfamily of the Austroasiatic languages. Related to Ho and Mundari, it is spoken mainly by the Santals in the Indian states of Assam, Bihar, Jharkhand, Mizoram, Odisha, Tripura and West Bengal.[5] It is a recognised regional language of India per the Eighth Schedule of the Indian Constitution.[6] It is spoken by around 7.6 million people in India, Bangladesh, Bhutan and Nepal, making it the third most spoken Austroasiatic language after Vietnamese and Khmer.[5]

Santali was mainly an oral language until European missionaries began writing it in Bengali, Odia and Roman scripts. The Ol Chiki script was later developed by Raghunath Murmu in 1925. Ol Chiki is alphabetic, sharing none of the syllabic properties of the other Indic scripts, and is now widely used to write Santali in India.


According to linguist Paul Sidwell, Munda languages probably arrived on the coast of Odisha from Indochina about 4000–3500 years ago, and spread after the Indo-Aryan migration to Odisha.[7]

Until the nineteenth century, Santali had no written form, and all shared knowledge was transmitted by word of mouth from generation to generation. European interest in the study of the languages of India led to the first efforts at documenting the Santali language. Bengali, Odia and Roman scripts were first used to write Santali before the 1860s by European anthropologists, folklorists and missionaries, including A. R. Campbell, Lars Skrefsrud and Paul Bodding. Their efforts resulted in Santali dictionaries, versions of folk tales, and studies of the morphology, syntax and phonetic structure of the language.

The Ol Chiki script was created for Santali by Mayurbhanj poet Raghunath Murmu in 1925 and first publicised in 1939.[8]

Ol Chiki is widely accepted as a Santali script among Santal communities. At present, Ol Chiki is the official script for Santali literature and language in West Bengal, Odisha and Jharkhand.[9][10] However, users in Bangladesh use the Bengali script instead.

Santali was honoured in December 2013 when the University Grants Commission of India decided to introduce the language in the National Eligibility Test to allow lecturers to use the language in colleges and universities.[11]

Geographic distribution

Geographic distribution of Santali language by district. Greater shade implies a greater percentage.

The highest concentrations of Santali language speakers are in Santhal Pargana division, as well as East Singhbhum and Seraikela Kharsawan districts of Jharkhand, the Jangalmahals region of West Bengal (Jhargram, Bankura and Purulia districts) and Mayurbhanj district of Odisha.

Smaller pockets of Santali language speakers are found in the northern Chota Nagpur plateau (Hazaribagh, Giridih, Ramgarh, Bokaro and Dhanbad districts), Balasore and Kendujhar districts of Odisha, and throughout western and northern West Bengal (Birbhum, Paschim Medinipur, Hooghly, Paschim Bardhaman, Purba Bardhaman, Malda, Dakshin Dinajpur, Uttar Dinajpur, Jalpaiguri and Darjeeling districts), Banka district and Purnia division of Bihar (Araria, Katihar, Purnia and Kishanganj districts), and tea-garden regions of Assam (Kokrajhar, Sonitpur, Chirang and Udalguri districts). Outside India, the language is spoken in pockets of Rangpur and Rajshahi divisions of northern Bangladesh as well as the Morang and Jhapa districts in the Terai of Province No. 1 in Nepal.[12][13]

Santali is spoken by over seven million people across India, Bangladesh, Bhutan, and Nepal.[5] According to the 2011 census, India has a total of 7,368,192 Santali speakers (including 358,579 Karmali and 26,399 Mahli speakers).[14][15] The state-wise distribution is Jharkhand (2.75 million), West Bengal (2.43 million), Odisha (0.86 million), Bihar (0.46 million), Assam (0.21 million) and a few thousand in each of Chhattisgarh, Mizoram, Arunachal Pradesh and Tripura.[16]

Official status

Santali is one of India's 22 scheduled languages.[6] It is also recognised as an additional official language of the states of Jharkhand and West Bengal.[17][18]


Dialects of Santali include Kamari-Santali, Khole, Lohari-Santali, Mahali, Manjhi and Paharia.[5][19][20]



Santali has 21 consonants, not counting the 10 aspirated stops which occur primarily, but not exclusively, in Indo-Aryan loanwords and are given in parentheses in the table below.[21]

                        Bilabial   Alveolar   Retroflex   Palatal   Velar    Glottal
Nasal                   m          n          (ɳ)*        ɲ         ŋ
Stop        voiceless   p (pʰ)     t (tʰ)     ʈ (ʈʰ)      c (cʰ)    k (kʰ)
            voiced      b (bʱ)     d (dʱ)     ɖ (ɖʱ)      ɟ (ɟʱ)    ɡ (ɡʱ)
Fricative                          s                                         h
Trill/Flap                         r          ɽ
Approximant             w          l                      j
*ɳ only appears as an allophone of /n/ before /ɖ/.

In native words, the opposition between voiceless and voiced stops is neutralised in word-final position. A typical Munda feature is that word-final stops are "checked", i.e. glottalised and unreleased.


Santali has eight oral and six nasal vowel phonemes. With the exception of /e o/, all oral vowels have a nasalized counterpart.

            Front    Central   Back
High        i ĩ                u ũ
Mid-high    e        ə ə̃       o
Mid-low     ɛ ɛ̃                ɔ ɔ̃
Low                  a ã

There are numerous diphthongs.


Santali, like all Munda languages, is a suffixing agglutinating language.


Nouns are inflected for number and case.[22]


Three numbers are distinguished: singular, dual and plural.[23]

Singular   ᱥᱮᱛᱟ (seta)           'dog'
Dual       ᱥᱮᱛᱟᱼᱠᱤᱱ (seta-kin)   'two dogs'
Plural     ᱥᱮᱛᱟᱼᱠᱚ (seta-kɔ)     'dogs'


The case suffix follows the number suffix. The following cases are distinguished:[24]

Case                    Marker                         Function
Nominative              -Ø                             Subject and object
Genitive                ᱼᱨᱮᱱ (animate)                 Possessor
                        ᱼᱟᱜ, ᱼᱨᱮᱭᱟᱜ (inanimate)
Comitative              ᱼᱴᱷᱮᱱ/ᱼᱴᱷᱮᱡ                     Goal, place
Instrumental-Locative   ᱼᱛᱮ                            Instrument, cause, motion
Sociative               ᱼᱥᱟᱶ                           Association
Allative                ᱼᱥᱮᱱ/ᱼᱥᱮᱡ                      Direction
Ablative                ᱼᱠᱷᱚᱱ/ᱼᱠᱷᱚᱡ                     Source, origin
Locative                ᱼᱨᱮ                            Spatio-temporal location

Transcribed version:

Case                    Marker                         Function
Nominative              -Ø                             Subject and object
Genitive                -rɛn (animate)                 Possessor
                        -ak', -rɛak' (inanimate)
Comitative              -ʈhɛn/-ʈhɛc'                   Goal, place
Instrumental-Locative   -tɛ                            Instrument, cause, motion
Sociative               -são                           Association
Allative                -sɛn/-sɛc'                     Direction
Ablative                -khɔn/-khɔc'                   Source, origin
Locative                -rɛ                            Spatio-temporal location
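
The ordering rule stated above (stem, then number suffix, then case suffix) can be sketched in Python. This is only a mechanical illustration: the names `NUMBER_SUFFIXES`, `CASE_SUFFIXES` and `inflect` are hypothetical, the suffixes are copied from the transcribed tables in this section, the nominative is treated as unmarked, and where a case has variant markers the first one is picked arbitrarily. The sketch joins morphemes with hyphens and makes no claim about phonological adjustments or about which combinations are attested.

```python
# Number suffixes (singular is unmarked).
NUMBER_SUFFIXES = {
    "singular": "",
    "dual": "kin",   # Ol Chiki ᱠᱤᱱ
    "plural": "kɔ",
}

# Case suffixes from the transcribed case table. Where the table lists
# variants, only the first is kept (an arbitrary choice for this sketch).
CASE_SUFFIXES = {
    "nominative": "",            # treated as unmarked
    "genitive_animate": "rɛn",
    "genitive_inanimate": "ak'",
    "comitative": "ʈhɛn",
    "instrumental_locative": "tɛ",
    "sociative": "são",
    "allative": "sɛn",
    "ablative": "khɔn",
    "locative": "rɛ",
}


def inflect(stem: str, number: str = "singular", case: str = "nominative") -> str:
    """Join stem, number suffix and case suffix, in that order, with hyphens."""
    parts = [stem, NUMBER_SUFFIXES[number], CASE_SUFFIXES[case]]
    return "-".join(part for part in parts if part)


print(inflect("seta"))                        # seta
print(inflect("seta", "plural"))              # seta-kɔ
print(inflect("seta", "plural", "ablative"))  # seta-kɔ-khɔn
```

The last form is a segmented string produced by the rule, shown only to make the suffix order visible; it is not quoted from a grammar.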


Santali has possessive suffixes which are only used with kinship terms: 1st person -ɲ, 2nd person -m, 3rd person -t. The suffixes do not distinguish possessor number.[25]


The personal pronouns in Santali distinguish inclusive and exclusive first person and anaphoric and demonstrative third person.[26]

Personal pronouns
                             Singular   Dual     Plural
1st person   exclusive       iɲ         əliɲ     alɛ
             inclusive                  alaŋ     abo
2nd person                   am         aben     apɛ
3rd person   anaphoric       ac'        əkin     ako
             demonstrative   uni        unkin    onko

The interrogative pronouns have different forms for animate ('who?') and inanimate ('what?'), and referential ('which?') vs. non-referential.[27]

Interrogative pronouns
                  Animate   Inanimate
Referential       ɔkɔe      oka
Non-referential   cele      cet'

The indefinite pronouns are:[28]

Indefinite pronouns
            Animate     Inanimate
'any'       jãheã       jãhã
'some'      adɔm        adɔmak
'another'   ɛʈak'ic'    ɛʈak'ak'

The demonstratives distinguish three degrees of deixis (proximate, distal, remote) and simple ('this', 'that', etc.) and particular ('just this', 'just that') forms.[29]

            Simple                 Particular
            Animate   Inanimate    Animate   Inanimate
Proximate   nui       noa          nii       niə
Distal      uni       ona          ini       inə
Remote      həni      hana         hini      hinə


The basic cardinal numbers (transcribed into Latin script IPA)[30] are:

1 ᱢᱤᱫ mit'
2 ᱵᱟᱨ bar
3 ᱯᱮ pɛ
4 ᱯᱩᱱ pon
5 ᱢᱚᱬᱮ mɔ̃ɽɛ̃
6 ᱛᱩᱨᱩᱭ turui
7 ᱮᱭᱟᱭ ɛyae
8 ᱤᱨᱟᱹᱞ irəl
9 ᱟᱨᱮ arɛ
10 ᱜᱮᱞ gɛl
20 ᱤᱥᱤ -isi
100 ᱥᱟᱭ -sae

The numerals are used with numeral classifiers. Distributive numerals are formed by reduplicating the first consonant and vowel, e.g. babar 'two each'.
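
The reduplication rule for distributive numerals can be illustrated with a short Python sketch. It assumes a simplified reading of the rule: the numeral must begin with exactly one consonant followed by a plain vowel, and that pair is copied to the front. The function name `distributive` and the vowel inventory in `VOWELS` are hypothetical choices for this example; vowel-initial numerals and nasalised vowels written with combining marks are not handled, because the text gives only the one example.

```python
# Plain (oral) vowel letters used in the transcriptions in this article.
VOWELS = set("aeiouəɛɔ")


def distributive(numeral: str) -> str:
    """Prefix a copy of the initial consonant + vowel to the numeral."""
    if (
        len(numeral) < 2
        or numeral[0] in VOWELS
        or numeral[1] not in VOWELS
    ):
        # The text only illustrates consonant-initial numerals.
        raise ValueError("sketch only covers numerals beginning consonant + vowel")
    return numeral[:2] + numeral


print(distributive("bar"))  # babar
```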

Numbers largely follow a base-10 pattern. Numbers from 11 to 19 are formed by addition: "gel" ('10') is followed by the single-digit number (1 through 9). Multiples of ten are formed by multiplication: the single-digit number (2 through 9) is followed by "gel" ('10'). Some numbers also belong to a base-20 number system: 20 can be "bar gel" or "isi", and 30 can be expressed in either system:

ᱯᱮ ᱜᱮᱞ  or  (ᱢᱤᱫ) ᱤᱥᱤ ᱜᱮᱞ

pe gel  or  (mit’) isi gel

(3 × 10)  or  ((1) × 20 + 10)
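
The additive and multiplicative patterns described above can be sketched in Python. This is a minimal illustration rather than a full numeral generator: the names `DIGITS`, `TEN`, `TWENTY`, `decimal_form` and `vigesimal_form` are hypothetical, the word forms are the IPA transcriptions from the numeral list, and the sketch covers only the values the text actually describes (1 to 19, multiples of ten up to 90, and the base-20 alternatives for 20 and 30).

```python
# Transcribed cardinal numerals from the list above.
DIGITS = {
    1: "mit'",
    2: "bar",
    3: "pɛ",      # ᱯᱮ; spelled "pe" in the plain-letter example for 30
    4: "pon",
    5: "mɔ̃ɽɛ̃",
    6: "turui",
    7: "ɛyae",
    8: "irəl",
    9: "arɛ",
}
TEN = "gɛl"
TWENTY = "isi"


def decimal_form(n: int) -> str:
    """Base-10 form for 1-19 and for multiples of ten from 20 to 90."""
    if 1 <= n <= 9:
        return DIGITS[n]
    if n == 10:
        return TEN
    if 11 <= n <= 19:
        # Addition: ten followed by the single-digit number.
        return f"{TEN} {DIGITS[n - 10]}"
    if 20 <= n <= 90 and n % 10 == 0:
        # Multiplication: the single-digit number followed by ten.
        return f"{DIGITS[n // 10]} {TEN}"
    raise ValueError("value is outside the patterns described in the text")


def vigesimal_form(n: int) -> str:
    """Base-20 alternative, limited to the two values the text illustrates."""
    if n == 20:
        return TWENTY
    if n == 30:
        # (1 x) 20 + 10, with the optional mit' left out.
        return f"{TWENTY} {TEN}"
    raise ValueError("the text gives base-20 forms only for 20 and 30")


print(decimal_form(12))    # gɛl bar
print(decimal_form(30))    # pɛ gɛl
print(vigesimal_form(30))  # isi gɛl
```

Both functions raise an error rather than guess at forms such as 21 or 40 in the base-20 system, since the text does not say how those are built.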



Verbs in Santali inflect for tense, aspect, mood and voice, and for the person and number of the subject and sometimes of the object.[31]

Subject markers

                         singular   dual    plural
1st person   exclusive   -ɲ(iɲ)     -liɲ    -lɛ
             inclusive              -laŋ    -bon
2nd person               -m         -ben    -pɛ
3rd person               -e         -kin    -ko

Object markers

Transitive verbs with pronominal objects take infixed object markers.

                         singular   dual     plural
1st person   exclusive   -iɲ-       -liɲ-    -lɛ-
             inclusive              -laŋ-    -bon-
2nd person               -me-       -ben-    -pɛ-
3rd person               -e-        -kin-    -ko-
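
The two marker paradigms can be held in a simple lookup structure, sketched here in Python. The names `SUBJECT_MARKERS`, `OBJECT_MARKERS` and `lookup` are hypothetical. One assumption is made explicit in the code: the first-person singular is stored without an inclusive/exclusive value, since that distinction only arises in the dual and plural. The sketch only retrieves the forms listed in the tables and does not model where in the verb the markers attach.

```python
# Keys are (person, number, clusivity). Clusivity is None wherever the
# tables make no inclusive/exclusive distinction; treating the 1st person
# singular this way is an assumption of this sketch.
SUBJECT_MARKERS = {
    (1, "singular", None): "-ɲ(iɲ)",
    (1, "dual", "exclusive"): "-liɲ",
    (1, "plural", "exclusive"): "-lɛ",
    (1, "dual", "inclusive"): "-laŋ",
    (1, "plural", "inclusive"): "-bon",
    (2, "singular", None): "-m",
    (2, "dual", None): "-ben",
    (2, "plural", None): "-pɛ",
    (3, "singular", None): "-e",
    (3, "dual", None): "-kin",
    (3, "plural", None): "-ko",
}

OBJECT_MARKERS = {
    (1, "singular", None): "-iɲ-",
    (1, "dual", "exclusive"): "-liɲ-",
    (1, "plural", "exclusive"): "-lɛ-",
    (1, "dual", "inclusive"): "-laŋ-",
    (1, "plural", "inclusive"): "-bon-",
    (2, "singular", None): "-me-",
    (2, "dual", None): "-ben-",
    (2, "plural", None): "-pɛ-",
    (3, "singular", None): "-e-",
    (3, "dual", None): "-kin-",
    (3, "plural", None): "-ko-",
}


def lookup(table, person, number, clusivity=None):
    """Return the marker for a person/number cell of one of the tables."""
    if person == 1 and number != "singular":
        if clusivity not in ("inclusive", "exclusive"):
            raise ValueError("1st person dual/plural needs a clusivity value")
    else:
        clusivity = None  # ignored where the paradigm has no distinction
    return table[(person, number, clusivity)]


print(lookup(SUBJECT_MARKERS, 1, "dual", "inclusive"))  # -laŋ
print(lookup(SUBJECT_MARKERS, 3, "plural"))             # -ko
print(lookup(OBJECT_MARKERS, 2, "singular"))            # -me-
```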


Santali is an SOV language, though topics can be fronted.[32]

Influence on other languages


Borrowing between Santali and other Indian languages has not yet been fully studied. In modern Indian languages such as Western Hindi, the steps of evolution from the Midland Prakrit Sauraseni can be traced clearly. In the case of Bengali, such steps are not always clear and distinct, and one has to look at other influences that moulded Bengali's essential characteristics.[citation needed]

Notable work in this field was begun by the linguist Byomkes Chakrabarti in the 1960s. Chakrabarti investigated the complex process by which Austroasiatic elements, particularly Santali ones, were assimilated into Bengali. He also showed the overwhelming influence of Bengali on Santali. His formulations are based on a detailed study of two-way influences on all aspects of both languages, and try to bring out the unique features of each. More research is awaited in this area.[citation needed]

The linguist Khudiram Das authored the 'Santali Bangla Samashabda Abhidhan' (সাঁওতালি বাংলা সমশব্দ অভিধান), a book focusing on the influence of the Santali language on Bengali and providing a basis for further research on the subject. His 'Bangla Santali Bhasha Samparka' (বাংলা সান্তালী ভাষা-সম্পর্ক) is a collection of essays in e-book format on the relationship between the Bengali and Santali languages, dedicated to the linguist Suniti Kumar Chatterji.



  1. ^ "Statement 1: Abstract of speakers' strength of languages and mother tongues – 2011". Office of the Registrar General & Census Commissioner, India. Archived from the original on 16 July 2019. Retrieved 7 July 2018.
  2. ^ Santali at Ethnologue (21st ed., 2018)
    Mahali at Ethnologue (21st ed., 2018)
  3. ^ "P and AR & e-Governance Dept". Retrieved 10 January 2021.
  4. ^ "Redirected". 19 November 2019. Archived from the original on 9 May 2019. Retrieved 9 May 2019.
  5. ^ a b c d Santali at Ethnologue (18th ed., 2015) (subscription required)
    Mahali at Ethnologue (18th ed., 2015) (subscription required)
  6. ^ a b "Distribution of the 22 Scheduled Languages". Census of India. 20 May 2013. Archived from the original on 7 February 2013. Retrieved 26 February 2018.
  7. ^ Sidwell, Paul. 2018. Austroasiatic Studies: state of the art in 2018. Presentation at the Graduate Institute of Linguistics, National Tsing Hua University, Taiwan, 22 May 2018. Archived 22 May 2018 at the Wayback Machine.
  8. ^ Hembram, Phatik Chandra (2002). Santhali, a Natural Language. U. Hembram. p. 165.
  9. ^ "Ol Chiki (Ol Cemet', Ol, Santali)". Archived from the original on 27 November 2015. Retrieved 19 March 2015.
  10. ^ "Santali Localization". Archived from the original on 17 March 2016. Retrieved 19 March 2015.
  11. ^ "Syllabus for UGC NET Santali, Dec 2013" (PDF). Archived (PDF) from the original on 6 November 2018. Retrieved 4 January 2020.
  12. ^ "Santhali". Ethnologue. Archived from the original on 25 May 2020. Retrieved 4 January 2020.
  13. ^ "Santhali becomes India's first tribal language to get own Wikipedia edition". Hindustan Times. 9 August 2018. Archived from the original on 22 February 2019. Retrieved 22 February 2019.
  14. ^ "SCHEDULED LANGUAGES IN DESCENDING ORDER OF SPEAKERS' STRENGTH - 2011" (PDF). Archived (PDF) from the original on 9 October 2022. Retrieved 17 December 2019.
  15. ^ "ABSTRACT OF SPEAKERS' STRENGTH OF LANGUAGES AND MOTHER TONGUES - 2011" (PDF). Archived (PDF) from the original on 14 November 2018. Retrieved 17 December 2019.
  16. ^ "PART-A: DISTRIBUTION OF THE 22 SCHEDULED LANGUAGES-INDIA/STATES/UNION TERRITORIES - 2011 CENSUS" (PDF). Archived (PDF) from the original on 15 April 2022. Retrieved 17 December 2019.
  17. ^ "Second language". India Today. 22 October 2011. Archived from the original on 14 February 2022. Retrieved 5 November 2019.
  18. ^ Roy, Anirban (27 May 2011). "West Bengal to have six more languages for official use". India Today. Archived from the original on 6 March 2023. Retrieved 5 November 2019.
  19. ^ "Glottolog 3.2 – Santali". Archived from the original on 9 July 2018. Retrieved 26 February 2018.
  20. ^ "Santali: Paharia language". Global recordings network. Archived from the original on 3 December 2018. Retrieved 26 February 2018.
  21. ^ Anderson, Gregory D.S. (2007). The Munda verb: typological perspectives. Berlin: Mouton de Gruyter.
  22. ^ Ghosh (2008), p. 32.
  23. ^ Ghosh (2008), pp. 32–33.
  24. ^ Ghosh (2008), pp. 34–38.
  25. ^ Ghosh (2008), p. 38.
  26. ^ Ghosh (2008), p. 41.
  27. ^ Ghosh (2008), p. 43.
  28. ^ Ghosh (2008), p. 44.
  29. ^ Ghosh (2008), p. 45.
  30. ^ "Santali". The Department of Linguistics, Max Planck Institute (Leipzig, Germany). 2001. Archived from the original on 1 December 2017. Retrieved 27 November 2017.
  31. ^ Ghosh (2008), p. 53ff.
  32. ^ Ghosh (2008), p. 74.

Works cited

  • Ghosh, Arun (2008). "Santali". In Anderson, Gregory D.S. (ed.). The Munda Languages. London: Routledge. pp. 11–98.

Further reading


Grammars and primers