Google Ngrams chart showing the changing English romanization of the Arabic short vowels (ـَ, ـِ and ـُ) between the 19th and 20th centuries, using مُسْلِم (Muslim) and مُحَمَّد (Muhammad) as examples.

The romanization of Arabic is the systematic rendering of written and spoken Arabic in the Latin script. Romanized Arabic is used for various purposes, among them transcription of names and titles, cataloging Arabic language works, language education when used instead of or alongside the Arabic script, and representation of the language in scientific publications by linguists. These formal systems, which often make use of diacritics and non-standard Latin characters and are used in academic settings or for the benefit of non-speakers, contrast with informal means of written communication used by speakers such as the Latin-based Arabic chat alphabet.

Different systems and strategies have been developed to address the inherent problems of rendering various Arabic varieties in the Latin script. Examples of such problems are the symbols for Arabic phonemes that do not exist in English or other European languages; the means of representing the Arabic definite article, which is always spelled the same way in written Arabic but has numerous pronunciations in the spoken language depending on context; and the representation of short vowels (usually i u or e o, accounting for variations such as Muslim/Moslem or Mohammed/Muhammad/Mohamed).


Romanization is often termed "transliteration", but this is not technically correct.[1] Transliteration is the direct representation of foreign letters using Latin symbols, while most systems for romanizing Arabic are actually transcription systems, which represent the sound of the language, since short vowels and geminate consonants, for example, do not usually appear in Arabic writing. As an example, the rendering munāẓaratu l-ḥurūfi l-ʻarabīyah of the Arabic مناظرة الحروف العربية is a transcription, indicating the pronunciation; an example transliteration would be mnaẓrḧ alḥrwf alʻrbyḧ.
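The letter-for-letter nature of transliteration can be illustrated with a minimal Python sketch. The mapping below is an assumption inferred solely from the example transliteration in the preceding paragraph; it is not a published standard and covers only the letters that occur in that phrase. Code points follow the Unicode column of the comparison table.

```python
# Hypothetical mapping, inferred from the single example in the text above.
# Keys are Arabic letters given by code point; values are Latin symbols.
TRANSLIT = {
    "\u0627": "a",       # alif
    "\u0628": "b",       # bāʼ
    "\u0629": "\u1E27",  # tāʼ marbūṭah -> ḧ
    "\u062D": "\u1E25",  # ḥāʼ -> ḥ
    "\u0631": "r",       # rāʼ
    "\u0638": "\u1E93",  # ẓāʼ -> ẓ
    "\u0639": "\u02BB",  # ʻayn -> ʻ
    "\u0641": "f",       # fāʼ
    "\u0644": "l",       # lām
    "\u0645": "m",       # mīm
    "\u0646": "n",       # nūn
    "\u0648": "w",       # wāw
    "\u064A": "y",       # yāʼ
}


def transliterate(text: str) -> str:
    """Replace each mapped Arabic letter with its Latin symbol.

    Characters without a mapping (spaces, punctuation) pass through unchanged.
    No vowels are supplied, because none are written in the input.
    """
    return "".join(TRANSLIT.get(ch, ch) for ch in text)


# The three-word phrase used as the example in the text.
phrase = (
    "\u0645\u0646\u0627\u0638\u0631\u0629 "
    "\u0627\u0644\u062D\u0631\u0648\u0641 "
    "\u0627\u0644\u0639\u0631\u0628\u064A\u0629"
)
print(transliterate(phrase))
```

Because every input letter produces exactly one output symbol, the result contains no short vowels or case endings; a transcription such as munāẓaratu l-ḥurūfi l-ʻarabīyah cannot be produced by a table of this kind alone.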

Romanization standards and systems

Principal standards and systems are:

Early Romanization

The Christian doctrine in Arabic and Castilian language (1566) by Pérez de Ayala uses an innovative system for transcribing Valencian Arabic that has been called "the first Western system of Arabic scientific transcription" by Federico Corriente.[2]

Early Romanization of the Arabic language was standardized in the various bilingual Arabic-European dictionaries of the 17th–19th centuries:

Mixed digraphic and diacritical

Further information: Digraph (orthography) and Diacritic

Fully diacritical


Comparison table

Letter Unicode Name IPA BGN/UNGEGN ALA-LC EI Wehr 1 EALL BS DIN ISO ArabTeX Arabizi 2[19][20][21]
ء3 0621 hamzah ʔ ʼ 4 ʾ ʼ 4 ʾ ʼ 4 ʾ ˈˌ ' 2
ا 0627 alif ā ʾ A a/e/é
ب 0628 bāʼ b b
ت 062A tāʼ t t
ث 062B thāʼ θ th (t͟h)5 _t s/th/t
ج12 062C jīm d͡ʒ~ɡ~ʒ j dj (d͟j)5 j 6 ǧ ^g j/g/dj
ح 062D ḥāʼ ħ  7 .h 7/h
خ 062E khāʼ x kh (k͟h)5  6 x _h kh/7'/5
د 062F dāl d d
ذ 0630 dhāl ð dh (d͟h)5 _d z/dh/th/d
ر 0631 rāʼ r r
ز 0632 zayn/zāy z z
س 0633 sīn s s
ش 0634 shīn ʃ sh (s͟h)5 š ^s sh/ch/$
ص 0635 ṣād ş 7 .s s/9
ض 0636 ḍād  7 .d d/9'/D
ط 0637 ṭāʼ ţ 7 .t t/6/T
ظ 0638 ẓāʼ ðˤ~  7 ḏ̣/ẓ11 .z z/dh/6'/th
ع 0639 ʻayn ʕ ʻ 4 ʿ ʽ 4 ʿ ` 3
غ 063A ghayn ɣ gh (g͟h)5  6 ġ ġ .g gh/3'/8
ف8 0641 fāʼ f f
ق8 0642 qāf q q 2/g/q/8/9
ك 0643 kāf k k
ل 0644 lām l l
م 0645 mīm m m
ن 0646 nūn n n
ه 0647 hāʼ h h
و 0648 wāw w, w; ū w; U w/ou/oo/u/o
ي9 064A yāʼ j, y; ī y; I y/i/ee/ei/ai
آ 0622 alif maddah ʔaː ā, ʼā ʾā ʾâ 'A 2a/aa
ة 0629 tāʼ marbūṭah h, t h; t —; t h; t T a/e(h); et/at
ال 06270644 alif lām (var.) al- 10 ʾal al- el/al
ى9 0649 alif maqṣūrah á ā _A a
ـَ 064E fatḥah a a a/e/é
ـِ 0650 kasrah i i i/e/é
ـُ13 064F ḍammah u u ou/o/u
ـَا 064E0627 fatḥah alif ā A/aa a
ـِي 0650064A kasrah yāʼ ī iy I/iy i/ee
ـُو13 064F0648 ḍammah wāw ū uw U/uw ou/oo/u
ـَي 064E064A fatḥah yāʼ aj ay ay/ai/ey/ei
ـَو 064E0648 fatḥah wāw aw aw aw/aou
ـً14 064B fatḥatān an an an á aN an
ـٍ14 064D kasratān in in in í iN in/en
ـٌ14 064C ḍammatān un un un ú uN oun/on/oon/un

Romanization issues

Any romanization system has to make a number of decisions which are dependent on its intended field of application.


One basic problem is that written Arabic is normally unvocalized; i.e., many of the vowels are not written out, and must be supplied by a reader familiar with the language. Hence unvocalized Arabic writing does not give a reader unfamiliar with the language sufficient information for accurate pronunciation. As a result, a pure transliteration, e.g., rendering قطر as qṭr, is meaningless to an untrained reader. For this reason, transcriptions are generally used that add vowels, e.g. qaṭar. However, unvocalized systems correspond exactly to written Arabic, unlike vocalized systems such as the Arabic chat alphabet, which some claim detract from one's ability to spell.[22]
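The relationship between vocalized and unvocalized writing described above can be sketched in Python. The example assumes that the vowel and related marks are the eight combining characters U+064B to U+0652 (the three tanwīn signs, fatḥah, ḍammah, kasrah, shaddah and sukūn); the function name is hypothetical.

```python
# Combining marks removed to obtain ordinary unvocalized spelling:
# U+064B..U+064D tanwīn, U+064E fatḥah, U+064F ḍammah, U+0650 kasrah,
# U+0651 shaddah (gemination), U+0652 sukūn (no vowel).
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_vocalization(text: str) -> str:
    """Drop vowel marks, leaving only the letters normally written."""
    return "".join(ch for ch in text if ch not in HARAKAT)


# The word transcribed qaṭar: qāf + fatḥah, ṭāʼ + fatḥah, rāʼ.
vocalized = "\u0642\u064E\u0637\u064E\u0631"
unvocalized = strip_vocalization(vocalized)

print(len(vocalized), len(unvocalized))
```

The operation is one-way: once the marks are removed, the three remaining consonants no longer record which vowels, if any, separated them, which is why a reader must supply them from knowledge of the language.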

Transliteration vs. transcription

Most uses of romanization call for transcription rather than transliteration: instead of transliterating each written letter, they try to reproduce the sound of the words according to the orthography rules of the target language: Qaṭar. This applies equally to scientific and popular applications. A pure transliteration would need to omit vowels (e.g. qṭr), making the result difficult to interpret except for a subset of trained readers fluent in Arabic. Even if vowels are added, a transliteration system would still need to distinguish between multiple ways of spelling the same sound in the Arabic script, e.g. alif ا vs. alif maqṣūrah ى for the sound /aː/ ā, and the six different ways (ء إ أ آ ؤ ئ) of writing the glottal stop (hamza, usually transcribed ʼ). This sort of detail is needlessly confusing, except in a very few situations (e.g., typesetting text in the Arabic script).
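The many-to-one relationship between spellings and sounds mentioned above can be made concrete with a short Python sketch. The table lists only the eight letter forms named in the paragraph; the sound labels and the function name are illustrative assumptions.

```python
from collections import defaultdict

# Letter form (by code point) -> sound it writes, per the paragraph above.
SOUND = {
    "\u0621": "glottal stop",  # ء hamzah on the line
    "\u0622": "glottal stop",  # آ alif maddah (also marks a following long ā)
    "\u0623": "glottal stop",  # أ hamzah above alif
    "\u0624": "glottal stop",  # ؤ hamzah above wāw
    "\u0625": "glottal stop",  # إ hamzah below alif
    "\u0626": "glottal stop",  # ئ hamzah above yāʼ
    "\u0627": "long a",        # ا alif
    "\u0649": "long a",        # ى alif maqṣūrah
}


def spellings_by_sound(table: dict) -> dict:
    """Group letter forms by the sound they represent."""
    groups = defaultdict(list)
    for letter, sound in table.items():
        groups[sound].append(letter)
    return dict(groups)


groups = spellings_by_sound(SOUND)
for sound, letters in groups.items():
    codes = ", ".join(f"U+{ord(c):04X}" for c in letters)
    print(f"{sound}: {len(letters)} spellings ({codes})")
```

A transcription that writes one symbol per sound collapses each group to a single output, so the original spelling cannot be recovered from it; a transliteration has to assign a distinct symbol to every member of each group.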

Most issues related to the romanization of Arabic are about transliterating vs. transcribing; others, about what should be romanized:

A transcription may reflect the language as spoken, typically rendering names, for example, by the people of Baghdad (Baghdad Arabic), or the official standard (Literary Arabic) as spoken by a preacher in the mosque or a TV newsreader. A transcription is free to add phonological (such as vowels) or morphological (such as word boundaries) information. Transcriptions will also vary depending on the writing conventions of the target language; compare English Omar Khayyam with German Omar Chajjam, both for عمر خيام /ʕumar xajjaːm/, [ˈʕomɑr xæjˈjæːm] (unvocalized ʿmr ḫyām, vocalized ʻUmar Khayyām).

A transliteration is ideally fully reversible: a machine should be able to transliterate it back into Arabic. A transliteration can be considered as flawed for any one of the following reasons:

A fully accurate transcription may not be necessary for native Arabic speakers, as they would be able to pronounce names and sentences correctly anyway, but it can be very useful for those not fully familiar with spoken Arabic and who are familiar with the Roman alphabet. An accurate transliteration serves as a valuable stepping stone for learning, pronouncing correctly, and distinguishing phonemes. It is a useful tool for anyone who is familiar with the sounds of Arabic but not fully conversant in the language.

One criticism is that a fully accurate system would require special training, which most readers lack, in order to pronounce names correctly, and that without a universal romanization system non-native speakers will not pronounce them correctly anyway. The precision is lost if special characters are not reproduced and if the reader is not familiar with Arabic pronunciation.


Examples in Literary Arabic:

Arabic أمجد كان له قصر إلى المملكة المغربية
Arabic with diacritics (normally omitted) أَمْجَدُ كَانَ لَهُ قَصْر إِلَى الْمَمْلَكَةِ الْمَغْرِبِيَّة
IPA [/ʔamdʒadu kaːna lahuː qasˤr/] [/ʔila‿l.mamlakati‿l.maɣribij.jah/]
ALA-LC Amjad kāna lahu qaṣr Ilá al-mamlakah al-Maghribīyah
Hans Wehr amjad kāna lahū qaṣr ilā l-mamlaka al-maḡribīya
DIN 31635 ʾAmǧad kāna lahu qaṣr ʾIlā l-mamlakah al-Maġribiyyah
UNGEGN Amjad kāna lahu qaşr Ilá al-mamlakah al-maghribiyyah
ISO 233 ʾˈamǧad kāna lahu qaṣr ʾˈilaỳ ʾˈalmamlakaẗ ʾˈalmaġribiȳaẗ
ArabTeX am^gad kAna lahu qa.sr il_A almamlakaT alma.gribiyyaT
English Amjad had a palace To the Moroccan Kingdom
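The ASCII-only ArabTeX notation in the examples can be related to the diacritical notations with a small decoder. The Python sketch below is deliberately partial: it covers only six correspondences that can be read off the comparison table (^g, ^s, .g, A, I, U), and the function name is hypothetical. It does not implement the full ArabTeX encoding.

```python
# Partial table: ArabTeX ASCII token -> diacritical Latin letter.
# Only entries legible in the comparison table are included (an assumption
# for illustration; the real encoding has many more tokens).
ARABTEX_TO_LATIN = {
    "^g": "\u01E7",  # ǧ (jīm)
    "^s": "\u0161",  # š (shīn)
    ".g": "\u0121",  # ġ (ghayn)
    "A": "\u0101",   # ā (long a)
    "I": "\u012B",   # ī (long i)
    "U": "\u016B",   # ū (long u)
}


def decode_arabtex(text: str) -> str:
    """Convert known ArabTeX tokens, trying two-character tokens first."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ARABTEX_TO_LATIN:
            out.append(ARABTEX_TO_LATIN[pair])
            i += 2
        else:
            # Single-character token, or pass the character through unchanged.
            out.append(ARABTEX_TO_LATIN.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(decode_arabtex("am^gad kAna lahu"))
```

Tokens outside the table, such as the final T for tāʼ marbūṭah, are left untouched by this sketch, and the output carries no capitalization or initial hamzah mark, so it only approximates the DIN 31635 row above.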

Arabic alphabet and nationalism

There have been many instances of national movements to convert Arabic script into Latin script or to romanize the language.


LEBNAAN in proposed Said Akl alphabet (issue #686)

A Beirut newspaper, La Syrie, pushed for the change from Arabic script to Latin script in 1922. The leading figure in this movement was Louis Massignon, a French Orientalist, who brought his concern before the Arabic Language Academy in Damascus in 1928. Massignon's attempt at romanization failed, as the Academy and the population viewed the proposal as an attempt by the Western world to take over their country. Sa'id Afghani, a member of the Academy, asserted that the movement to romanize the script was a Zionist plan to dominate Lebanon.[24][25]


After the period of colonialism in Egypt, Egyptians were looking for a way to reclaim and reemphasize Egyptian culture. As a result, some Egyptians pushed for an Egyptianization of the Arabic language in which formal Arabic and colloquial Arabic would be combined into one language and the Latin alphabet would be used.[24][25] There was also the idea of finding a way to use hieroglyphics instead of the Latin alphabet.[24][25] A scholar, Salama Musa, agreed with the idea of applying a Latin alphabet to Egyptian Arabic, as he believed that it would allow Egypt to have a closer relationship with the West. He also believed that Latin script was key to the success of Egypt, as it would allow for more advances in science and technology. This change in script, he believed, would solve the problems inherent in Arabic, such as a lack of written vowels and difficulties writing foreign words.[24][25][26] Ahmad Lutfi As Sayid and Muhammad Azmi, two Egyptian intellectuals, agreed with Musa and supported the push for romanization.[24][25] The idea that romanization was necessary for modernization and growth in Egypt continued with Abd Al Aziz Fahmi in 1944. He was the chairman of the Writing and Grammar Committee for the Arabic Language Academy of Cairo.[24][25] He wanted to implement romanization in a way that allowed words and spellings to remain somewhat familiar to the Egyptian people. However, this effort failed, as the Egyptian people, particularly the older generation, felt a strong cultural tie to the Arabic alphabet.[24][25]



  1. ^ Adegoke, Kazeem Adekunle; Abdulraheem, Bashir (7 June 2017). "Re-Thinking Romanization of Arabic-Islamic Script". TARBIYA: Journal of Education in Muslim Society. 4 (1): 22–31. doi:10.15408/tjems.v4i1.5549. ISSN 2442-9848.
  2. ^ «La lengua de la gente común y no los priores de la gramática arábiga». La Doctrina christiana en lengua arábiga y castellana (1566) de Martín Pérez de Ayala, Teresa Soto González, University of Salamanca (in Spanish)
  3. ^ a b c d e f Edward Lipiński, 2012, Arabic Linguistics: A Historiographic Overview, pages 32–33
  4. ^ Pérez de Ayala, Martín (1566). Christian doctrine in the Arabic-Spanish language. Valencia.
  5. ^ "Romanization system for Arabic. BGN/PCGN 1956 System" (PDF).
  6. ^ a b c d "Arabic" (PDF). UNGEGN.
  7. ^ Technical reference manual for the standardization of geographical names (PDF). UNGEGN. 2007. p. 12 [22].
  8. ^ "Systèmes français de romanisation" (PDF). UNGEGN. 2009.
  9. ^ "Arabic romanization table" (PDF). The Library of Congress.
  10. ^ "IJMES Translation & Transliteration Guide". International Journal of Middle East Studies.
  11. ^ "Encyclopaedia of Islam Romanization vs ALA Romanization for Arabic". University of Washington Libraries.
  12. ^ Brockelmann, Carl; Ronkel, Philippus Samuel van (1935). Die Transliteration der arabischen Schrift... (PDF). Leipzig.
  13. ^ a b Reichmuth, Philipp (2009). "Transcription". In Versteegh, Kees (ed.). Encyclopedia of Arabic Language and Linguistics. Vol. 4. Brill. pp. 515–20.
  14. ^ Millar, M. Angélica; Salgado, Rosa; Zedán, Marcela (2005). Gramatica de la lengua arabe para hispanohablantes. Santiago de Chile: Editorial Universitaria. pp. 53–54. ISBN 978-956-11-1799-0.
  15. ^ "Standards, Training, Testing, Assessment and Certification". BSI Group. Archived from the original on 7 October 2008. Retrieved 18 May 2014.
  16. ^ ArabTex User Manual Section 4.1 : ASCII Transliteration Encoding.
  17. ^ "Buckwalter Arabic Transliteration". QAMUS LLC.
  18. ^ "Arabic Morphological Analyzer/The Buckwalter Transliteration". Xerox. Archived from the original on 26 September 2015. Retrieved 30 April 2017.
  19. ^ Sullivan, Natalie (July 2017). Writing Arabizi: Orthographic Variation in Romanized Lebanese Arabic on Twitter (Plan II Honors Thesis). doi:10.15781/T2W951823. hdl:2152/72420.
  20. ^ Bjørnsson, Jan Arild (November 2010). "Egyptian Romanized Arabic: A Study of Selected Features from Communication Among Egyptian Youth on Facebook" (PDF). University of Oslo. Retrieved 31 March 2019.
  21. ^ Abu Elhija, Dua'a (3 July 2014). "A new writing system? Developing orthographies for writing Arabic dialects in electronic media". Writing Systems Research. 6 (2): 190–214. doi:10.1080/17586801.2013.868334. ISSN 1758-6801. S2CID 219568845.
  22. ^ "Arabizi sparks concern among educators". 9 May 2013. Retrieved 18 May 2014.
  23. ^ "Arabic" (PDF). ALA-LC Romanization Tables. Library of Congress. p. 9. Retrieved 14 June 2013. 21. The prime (ʹ) is used: (a) To separate two letters representing two distinct consonantal sounds, when the combination might otherwise be read as a digraph.
  24. ^ a b c d e f g Shrivtiel, Shraybom (1998). The Question of Romanisation of the Script and The Emergence of Nationalism in the Middle East. Mediterranean Language Review. pp. 179–196.
  25. ^ a b c d e f g History of Arabic Writing
  26. ^ Shrivtiel, p. 188