Shom Peng
Region: Great Nicobar Island
Ethnicity: Shompen people
Native speakers: 400 (2004)[1]
Language family: Possibly a language isolate; traditionally considered Austroasiatic
Dialects:
  • Kalay (west)
  • Keyet (east)
Language codes
ISO 639-3: sii
ELP: Shom Peng
Approximate location where Shompen is spoken
Location in the Andaman and Nicobar Islands and in the Bay of Bengal.
Coordinates: 7°01′N 93°49′E (7.02°N 93.81°E)

Shompen, or Shom Peng, is a language or group of languages spoken on Great Nicobar Island in the Indian union territory of the Andaman and Nicobar Islands, in the Indian Ocean, northwest of Sumatra, Indonesia.

Partly because the native peoples of the Andaman and Nicobar Islands are protected from outside researchers, Shompen is poorly described: most descriptions date from the 19th century, and the few more recent ones are of poor quality. Shompen appears to be related to the other Southern Nicobarese varieties; however, Glottolog considers it a language isolate.


The Shompen are hunter-gatherers living in the hilly hinterland of the Great Nicobar Biosphere Reserve. Population estimates are approximately 400, but no census has been conducted.

Parmanand Lal (1977:104)[2] reported the presence of several Shompen villages in the interior of Great Nicobar Island.


During the 20th century, the only data available were a short word list in De Roepstorff (1875),[3] scattered notes in Man (1886),[4] and a comparative list in Man (1889).[5]

It was a century before more data became available, with 70 words being published in 1995[6] and much new data being published in 2003, the most extensive so far.[7] However, Blench and Sidwell (2011) note that the 2003 book is at least partially plagiarized and that the authors show little sign of understanding the material, which is full of anomalies and inconsistencies. For example, [a] is transcribed as short ⟨a⟩ but schwa [ə] as long ⟨ā⟩, the opposite of normal conventions in India or elsewhere. It appears to have been taken from an earlier source or sources, perhaps from the colonial era.[8] Van Driem (2008) found it too difficult to work with.[9] However, Blench and Sidwell made an attempt at analyzing and retranscribing the data, based on comparisons of Malay loanwords and identifiable cognates with other Austroasiatic languages, and concluded that the data in the 1995 and 2003 publications come from either the same language or two closely related languages.


Although Shompen is traditionally lumped in with other Nicobarese languages, which form a branch of the Austroasiatic languages, there was little evidence to support this assumption during the 20th century. Man (1886) notes that there are very few Shompen words that "bear any resemblance" to Nicobarese and also that "in most instances", words differ between the two Shompen groups with which he worked. For example, the word for "back (of the body)" is given as gikau, tamnōi, and hokōa in different sources; "to bathe" as pu(g)oihoɔp and hōhōm; and "head" as koi and fiāu. In some of these cases, that may be a matter of borrowed versus native vocabulary, as koi appears to be Nicobarese, but it also suggests that Shompen is not a single language.

Based on the 1995 data, however, van Driem (2008) concluded that Shompen was a Nicobarese language.[9]

Blench and Sidwell note many cognates with both Nicobarese and Jahaic in the 2003 data, including many words found only in Nicobarese or only in Jahaic (or sometimes also in Senoic), and they also note that Shompen shares historical phonological developments with Jahaic. Given the likelihood of borrowing from Nicobarese, that suggests that Shompen might be a Jahaic or at least Aslian language, or perhaps a third branch of a Southern Austroasiatic family alongside Aslian and Nicobarese.[8]

However, Paul Sidwell (2017)[10] classifies Shompen as a Southern Nicobaric language, rather than a separate branch of Austroasiatic.


It is not clear if the following description applies to all varieties of Shompen or how phonemic it is.

Eight vowel qualities are recovered from the transcription, /i e ɛ a ə ɔ o u/, which may be nasalized and/or lengthened. There are numerous vowel sequences and diphthongs.

The consonants are attested as follows:

                        Bilabial  Alveolar  Palatal  Velar  Glottal
Stop         voiceless  p         t         c        k      ʔ
             voiced     b         d         ɟ        g
Fricative    voiceless  ɸ                            x      h
             voiced                                  ɣ
Nasal                   m         n         ɲ        ŋ
Approximant                       l         j        w

(Three of the place columns are subdivided into plain and aspirated; only plain consonants are listed.)

Many Austroasiatic roots with final nasal stops, *m *n *ŋ, appear in Shompen with voiced oral stops [b d ɡ], which resembles Aslian and especially Jahaic, whose historical final nasals have become prestopped or fully oral. Whereas in Jahaic the nasal stops merged with the oral stops, in Shompen the oral stops appear to have been lost first, only to be reacquired as nasals became oral. However, numerous words retain final nasal stops, and it is not clear whether borrowing from Nicobarese is enough to explain all of those exceptions. Shompen could have been partially relexified under the influence of Nicobarese, or consultants might have given Nicobarese words during elicitation.

Other historical sound changes are word-final *r and *l shifting to [w], *r before a vowel shifting to [j], the deletion of final *h and *s, and the breaking of Austroasiatic long vowels into diphthongs.
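The sound changes listed above can be pictured as ordered rewrite rules. The following is only a minimal sketch under stated assumptions: the function name `apply_changes`, the rule ordering, and the input strings are all hypothetical, the inputs are toy strings rather than real reconstructions, and the breaking of long vowels is left out because the text does not say which diphthongs result.

```python
import re

# Vowel qualities taken from the inventory given earlier in the article.
VOWELS = "ieɛaəɔou"


def apply_changes(form: str) -> str:
    """Apply three of the described changes to a toy proto-form.

    The ordering is an assumption made for illustration only.
    """
    # Word-final *r and *l shift to [w].
    form = re.sub(r"[rl]$", "w", form)
    # *r before a vowel shifts to [j].
    form = re.sub(rf"r(?=[{VOWELS}])", "j", form)
    # Final *h and *s are deleted.
    form = re.sub(r"[hs]$", "", form)
    return form


if __name__ == "__main__":
    # Toy inputs, NOT attested reconstructions.
    for toy in ["kar", "tal", "ras", "muh"]:
        print(toy, "->", apply_changes(toy))
```

With these toy inputs the script prints `kar -> kaw`, `tal -> taw`, `ras -> ja`, and `muh -> mu`.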


There is no standard way to write the Shompen language.




Word    Shompen      Southern Nicobarese   proto-Nicobarese
hot     dai(d)       tait                  *taɲ
four    fuat         fôat                  *foan
child   köˑat        kōˑan                 *kuːn
lip     tūˑin        paṅ-nōˑin             *manuːɲ
dog     kab          âm                    *ʔam
night   tahap        hatòm                 *hatəːm
male    akòit        (otāˑha)              *koːɲ
ear     nâng         nâng                  *naŋ
one     heng         heg                   *hiaŋ
belly   (kàu)        wīˑang                *ʔac
sun     hok-ngīˑa    hēg                   -
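
As a rough illustration of the final-nasal correspondence, the rows of the table above can be sorted by whether the Shompen form ends in an oral stop or a nasal wherever the proto-Nicobarese form ends in a nasal. This is a sketch, not an analysis from the sources: the names `ROWS`, `final_type`, and `classify` are hypothetical, parentheses are simply stripped (so `dai(d)` is treated as ending in `d`), and orthographic `ng` is assumed to stand for a velar nasal.

```python
NASALS = ("m", "n", "ɲ", "ŋ")
ORAL_STOPS = ("p", "t", "k", "b", "d", "g")

# (gloss, Shompen, proto-Nicobarese), copied from the table above.
ROWS = [
    ("hot", "dai(d)", "*taɲ"),
    ("four", "fuat", "*foan"),
    ("child", "köˑat", "*kuːn"),
    ("lip", "tūˑin", "*manuːɲ"),
    ("dog", "kab", "*ʔam"),
    ("night", "tahap", "*hatəːm"),
    ("male", "akòit", "*koːɲ"),
    ("ear", "nâng", "*naŋ"),
    ("one", "heng", "*hiaŋ"),
    ("belly", "(kàu)", "*ʔac"),
    ("sun", "hok-ngīˑa", "-"),
]


def final_type(shompen: str) -> str:
    """Classify the last segment of a Shompen form (orthographic guess)."""
    form = shompen.replace("(", "").replace(")", "")
    # Check nasals first so that final "ng" is not read as the stop "g".
    if form.endswith(("m", "n", "ng")):
        return "nasal"
    if form.endswith(ORAL_STOPS):
        return "oral stop"
    return "other"


def classify(rows):
    """Group glosses whose proto-form ends in a nasal by Shompen final type."""
    result = {"oral stop": [], "nasal": [], "other": []}
    for gloss, shompen, proto in rows:
        if not proto.endswith(NASALS):
            continue  # proto-form missing or not nasal-final
        result[final_type(shompen)].append(gloss)
    return result


RESULT = classify(ROWS)

if __name__ == "__main__":
    for kind, glosses in RESULT.items():
        print(kind, glosses)
```

Under these assumptions, six of the nine nasal-final proto-forms (hot, four, child, dog, night, male) correspond to a Shompen form ending in an oral stop, and three (lip, ear, one) keep a final nasal.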


  1. Shompen at Ethnologue (18th ed., 2015) (subscription required)
  2. Lal, Parmanand. 1977. Great Nicobar Island: study in human ecology. Calcutta: Anthropological Survey of India, Govt. of India.
  3. De Roëpstorff, 1875. Vocabulary of dialects spoken in the Nicobar and Andaman islands. 2nd ed. Calcutta.
  4. EH Man, 1886. "A Brief Account of the Nicobar Islanders, with Special Reference to the Inland Tribe of Great Nicobar." The Journal of the Anthropological Institute of Great Britain and Ireland, 15:428–451.
  5. EH Man, 1889. A dictionary of the Central Nicobarese language. London: W.H. Allen.
  6. Rathinasabapathy Elangaiyan et al., 1995. Shompen–Hindi Bilingual Primer Śompen Bhāratī 1. Port Blair and Mysore.
  7. Subhash Chandra Chattopadhyay & Asok Kumar Mukhopadhyay, 2003. The Language of the Shompen of Great Nicobar: a preliminary appraisal. Kolkata: Anthropological Survey of India.
  8. Roger Blench & Paul Sidwell, 2011. "Is Shom Pen a Distinct Branch?" In Sophana Srichampa and Paul Sidwell, eds. Austroasiatic Studies: Papers from ICAAL 4. Canberra: Pacific Linguistics. (ICAAL, ms)
  9. George van Driem, 2008. "The Shompen of Great Nicobar Island: New linguistic and genetic data, and the Austroasiatic homeland revisited." Mother Tongue, 13:227–247.
  10. Sidwell, Paul. 2017. "Proto-Nicobarese Phonology, Morphology, Syntax: work in progress". International Conference on Austroasiatic Linguistics 7, Kiel, Sept 29-Oct 1, 2017.
  11. "Shompen language and alphabet". Omniglot. Retrieved 2 September 2021.