Portuguese orthography is based on the Latin alphabet and makes use of the acute accent, the circumflex accent, the grave accent, the tilde, and the cedilla to denote stress, vowel height, nasalization, and other sound changes. The diaeresis was abolished by the last Orthography Agreement. Accented letters and digraphs are not counted as separate characters for collation purposes.
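As a rough illustration of the collation convention just mentioned, the sketch below sorts words by their base letters, so that accented letters and ç are not treated as separate characters. The function name `collation_key` is hypothetical, and stripping combining marks is an assumption that only approximates real dictionary ordering; it is not an official algorithm.

```python
import unicodedata


def collation_key(word):
    """Sort key that ignores diacritics (hypothetical helper, simplified)."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word.casefold())
    # Drop the combining marks (acute, circumflex, grave, tilde, cedilla).
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original word is kept as a tie-breaker for otherwise equal keys.
    return (base, word)


words = ["órgão", "orla", "nação", "nada"]
print(sorted(words, key=collation_key))
```

A plain code-point sort would place nada before nação and orla before órgão, because the accented letters have higher code points than any unaccented letter.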
The spelling of Portuguese is largely phonemic, but some phonemes can be spelled in more than one way. In ambiguous cases, the correct spelling is determined through a combination of etymology with morphology and tradition; so there is not a perfect one-to-one correspondence between sounds and letters or digraphs. Knowing the main inflectional paradigms of Portuguese and being acquainted with the orthography of other Western European languages can be helpful.
A full list of sounds, diphthongs, and their main spellings is given at Portuguese phonology. This article addresses the less trivial details of the spelling of Portuguese as well as other issues of orthography, such as accentuation.
Only the most frequent sounds appear below since a listing of all cases and exceptions would become cumbersome. Portuguese is a pluricentric language, and pronunciation of some of the letters differs. Apart from those variations, the pronunciation of most consonants is fairly straightforward. Only the consonants r, s, x, z, the digraphs ch, lh, nh, rr, and the vowels may require special attention from English speakers.
Although many letters have more than one pronunciation, their phonetic value is often predictable from their position within a word; that is normally the case for the consonants (except x). Since only five letters are available to write the fourteen vowel sounds of Portuguese, vowels have a more complex orthography, but even then, pronunciation is somewhat predictable. Knowing the main inflectional paradigms of Portuguese can help.
In the following table and in the remainder of this article, the phrase "at the end of a syllable" can be understood as "before a consonant, or at the end of a word". For the letter r, "at the start of a syllable (not between vowels)" means "at the beginning of a word or after l, n, s, or a prefix ending in a consonant". For letters with more than one common pronunciation, their most common phonetic values are given on the left side of the semicolon; sounds after it occur only in a limited number of positions within a word. Sounds separated by "~" are allophones or dialectal variants.
The names of the letters are masculine.
|Letter||European name||European name (IPA)||Brazilian name||Brazilian name (IPA)||Phonemic values||Example||Example (IPA)|
|Bb||bê||/be/||bê||/be/||/b/ or [β] nb 1||bato||[ˈbatʊ]|
|Cc||cê||/se/||cê||/se/||/k/ nb 2; /s/ nb 3||conciso||['kõsi.zu]|
|Dd||dê||/de/||dê||/de/||/d/ ~ [dʒ] nb 4 or [ð] nb 1||dádiva||[ˈdad(ʒ)ivɐ]|
|Ee||é||/ɛ/||é or ê||/ɛ/, /e/||/e/, /ɛ/, /i/ nb 5, /ɨ/, /ɐ/, /ɐi/||rente||[ˈʁẽt(ʃ)i]|
|Ff||efe||/ˈɛfɨ/||efe||/ˈɛfi/||/f/ nb 6||fala||[ˈfala]|
|Gg||gê or guê||/ʒe/, /ɡe/||gê or guê||/ʒe/, /ɡe/||/ɡ/ or [ɣ] nb 1; /ʒ/ nb 3 nb 6||gigante||[ʒiɡɐ̃t(ʃ)i]|
|Hh||agá||/ɐˈɡa/||agá||/aˈɡa/||natively silent, /ʁ/ in loanwords nb 7||homem||[ˈomẽ]|
|Ii||i||/i/||i||/i/||/i/ nb 5||idade||[iˈdad(ʒ)i]|
|Jj||jota||/ˈʒɔtɐ/||jota||/ˈʒɔtɐ/||/ʒ/ nb 6||janta||[ˈʒɐ̃ta]|
|Kk nb 8||capa||/ˈkapɐ/||cá||/ka/||/k/ nb 2||ketchup||[kɛt͡ʃ(iʃ)ˈup(i)]|
|Ll||ele||/ˈɛlɨ/||ele||/ˈɛli/||/l/ ~ [ɫ ~ w] nb 6 nb 9||lamaçal||[lamaˈsaw]|
|Mm||eme||/ˈɛmɨ/||eme||/ˈemi/||/m/ nb 6 nb 10||mala||[ˈmalɐ]|
|Nn||ene||/ˈɛnɨ/||ene||/ˈeni/||/n/ nb 5 nb 10||ninho||[ˈniɲʊ], [ˈnĩj̃u]|
|Oo||ó||/ɔ/||ó or ô||/ɔ/, /o/||/o/, /ɔ/, /u/ nb 5||óculos||[ˈɔkulus]|
|Qq||quê||/ke/||quê||/ke/||/k/ nb 2||quente||[ˈkẽt͡ʃi]|
|Rr||erre or rê||/ˈɛʁɨ/, /ʁe/||erre||/ˈɛʁi/||/ɾ/, /ʁ/ nb 6 nb 11 ~ /h/||raro||[ˈʁaɾu], [ˈhaɾu]|
|Ss||esse||/ˈɛsɨ/||esse||/ˈɛsi/||/s/, /z/ nb 12, [ʃ] nb 13 ~ [ʒ] nb 6||siso||[ˈsizu]|
|Tt||tê||/te/||tê||/te/||/t/ ~ [tʃ] nb 4 or [θ] nb 14||tente||[ˈtẽt͡ʃi]|
|Uu||u||/u/||u||/u/||/u/ nb 5||urubu||[uɾuˈbu]|
|Vv||vê||/ve/||vê||/ve/||/v/ or /β~b/ nb 15||vaca||[ˈvakɐ]|
|Ww nb 8||dâblio or duplo vê||/ˈdɐbliu/, /ˈduplu ˌve/||dáblio||/ˈdabliu/||/w/, /v/||kiwi||[kiˈwi]|
|Xx||xis||/ʃiʃ/||xis||/ʃis/||/ʃ/, /ks/, /z/, /s/, /gz/ nb 12 nb 16||xale||[ˈʃali]|
|Yy nb 8||ípsilon or i grego||/ˈipsɨlɔn/, /ˌi ˈgrɛgu/||ípsilon||/ˈipsilõ/||/j/, /i/|
|Zz||zê||/ze/||zê||/ze/||/z/, /s/, /ʃ/ nb 13 ~ [ʒ]|
Portuguese uses digraphs, pairs of letters which represent a single sound different from the sum of their components. Digraphs are not included in the alphabet.
|lh||/ʎ/, /lʲ/, /lj/|
|gu||/ɡ/; /ɡʷ/; /ɡu/|
The digraphs qu and gu, before e and i, may represent either plain or labialised sounds (quebra /ˈkebɾɐ/, cinquenta /sĩˈkʷẽtɐ/, guerra /ˈɡɛʁɐ/, sagui /saˈɡʷi/), but they are always labialised before a and o (quase, quociente, guaraná). The trema used to be employed to explicitly indicate labialised sounds before e and i (quebra vs. cinqüenta), but since its elimination, such words have to be memorised. Pronunciation divergences mean some of these words may be spelled differently (quatorze / catorze and quotidiano / cotidiano). The digraph ch is pronounced like English sh by the overwhelming majority of speakers. The digraphs lh and nh, of Occitan origin, denote palatal consonants that do not exist in English. The digraphs rr and ss are used only between vowels. The pronunciation of the digraph rr varies with dialect (see the note on the phoneme /ʁ/, above).
Portuguese makes use of five diacritics: the cedilla (ç), acute accent (á, é, í, ó, ú), circumflex accent (â, ê, ô), tilde (ã, õ), and grave accent (à, rarely ò, formerly also è, ì, and ù).
The cedilla indicates that ç is pronounced /s/ (from a historic palatalization). By convention, s is written instead of etymological ç at the beginning of words, as in "São", the hypocoristic form of the female name "Conceição".
The acute accent and the circumflex accent indicate that a vowel is stressed and also mark the quality of the accented vowel, more precisely its height: á, é, and ó are low vowels (except in nasal vowels); â, ê, and ô are high vowels. They also distinguish a few homographs: por "by" from pôr "to put", and pode "[he/she/it] can" from pôde "[he/she/it] could".
The tilde marks nasal vowels before glides such as in cãibra and nação, at the end of words, before final -s, and in some compounds: romãzeira "pomegranate tree", from romã "pomegranate", and vãmente "vainly", from vã "vain". It usually coincides with the stressed vowel unless there is an acute or circumflex accent elsewhere in the word or if the word is compound: órgão "organ", irmã + -zinha ("sister" + diminutive suffix) = irmãzinha "little sister". The form õ is used only in the plurals of nouns ending in -ão (nação → nações) and in the second person singular and third person forms of the verb pôr in the present tense (pões, põe, põem).
The grave accent marks the contraction of two consecutive vowels in adjacent words (crasis), normally the preposition a and an article or a demonstrative pronoun: a + aquela = àquela "at that", a + a = à "at the". It can also be used when indicating time: "às 4 horas" = "at 4 o'clock". It does not indicate stress.
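The contraction described above can be sketched as a small lookup. The function name `contract_with_a` is hypothetical. The sketch assumes lowercase input and assumes the caller already knows that the following word is an article or a demonstrative, since a string match alone cannot tell the article a from the preposition a. The forms àquele and àquilo follow the same pattern as àquela but are not listed in the text above.

```python
# Words that fuse with a preceding preposition "a" (crasis).
# Assumption: the caller has already established that the word is an
# article or a demonstrative pronoun.
CRASIS_TARGETS = {"a", "as", "aquele", "aqueles", "aquela", "aquelas", "aquilo"}


def contract_with_a(next_word):
    """Join the preposition 'a' with the following word (hypothetical helper)."""
    if next_word in CRASIS_TARGETS:
        # The two vowels merge into a single à; the rest of the word is kept.
        return "à" + next_word[1:]
    # No crasis: preposition and word stay separate.
    return "a " + next_word


print(contract_with_a("a"))        # a + a
print(contract_with_a("aquela"))   # a + aquela
print(contract_with_a("ele"))      # no contraction
```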
Sometimes à and ò are used in other contracted forms, e.g. cò(s) and cà(s) (from the comparative conjunction ‘than’ and the definite articles o and a). These examples are rare, however, and tend to be regarded as nonstandard or dialectal, as are co(s) and coa/ca(s), from ‘with’ + definite articles. Other examples are prà, pràs (from para + a/as) and prò, pròs (from para + o/os). According to the orthographic rules of 1990 (adopted only in Portugal, Brazil, and Cabo Verde in 2009), these forms should be spelled without the grave accent.
Some grammarians also used è and ò to denote unstressed [ɛ] and [ɔ], respectively. This accentuation is not provided for in the current orthographic standards.
Until the spelling reforms of 1971 (Brazil) and 1973 (Portugal), the grave accent was also used to mark vowels that had carried an irregular stress before the word underwent derivation. For example, in adverbs formed with the suffix -mente, and in some other cases where a vowel became only slightly stressed or unstressed (mostly through affixation), any of the vowels could take the grave accent: provàvelmente, genèricamente, analìticamente, pròpriamente, ùnicamente. The main pattern was to change an acute accent present in the base word into a grave accent after affixation, e.g. on the penultimate syllable: notável › notàvelmente; on the final syllable: jacaré › jacarèzinho. The circumflex accent did not change: simultâneo/a › simultâneamente.
The graphemes â, ê, ô and é typically represent oral vowels, but before m or n followed by another consonant (or word final -m in the case of ê and é), the vowels represented are nasal. Elsewhere, nasal vowels are indicated with a tilde (ã, õ).
Letters with a diaeresis are now practically obsolete. Until 2009 they were still used in Brazilian Portuguese in the combinations güe/qüe and güi/qüi (European Portuguese used the grave accent in this case between 1911 and 1945, after which it was abolished). In the old orthography they were also used, as in English, French and Dutch, to separate diphthongs (e.g. Raïnha, Luïsa, saüde). The other way to separate diphthongs and non-hiatic vowel combinations is to use the acute accent (as in modern saúde) or the circumflex (as in old-style Corôa).
Below are the general rules for the use of the acute accent and the circumflex in Portuguese. Primary stress may fall on any of the three final syllables of a word. A word is called oxytone if it is stressed on its last syllable, paroxytone if stress falls on the syllable before the last (the penult), and proparoxytone if stress falls on the third syllable from the end (the antepenult). Most multisyllabic words are stressed on the penult.
All words stressed on the antepenult take an accent mark. Words with two or more syllables, stressed on their last syllable, are not accented if they have any ending other than -a(s), -e(s), -o(s), -am, -em, -ens; except to indicate hiatus as in açaí. With these endings paroxytonic words must then be accented to differentiate them from oxytonic words, as in amável, lápis, órgão.
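The rules above can be sketched as a small decision function. This is a simplified reading, not a complete implementation: the name `needs_accent_mark` is hypothetical, the caller must supply the syllabification and the index of the stressed syllable, and the hiatus exception (as in açaí) and monosyllables are left out. The sketch also assumes that the endings -a(s), -e(s), -o(s) refer to a final syllable whose only vowel is a, e, or o, so that a diphthong ending such as -ão does not count as -o.

```python
import re
import unicodedata

# Final syllable made of optional consonants (or qu/gu) plus one of the
# listed endings: -a(s), -e(s), -o(s), -am, -em, -ens.
_LISTED_ENDING = re.compile(
    r"^(?:qu|gu|[^aeiou\u00e3\u00f5])*(?:[aeo]s?|am|em|ens)$"
)


def _strip_stress_marks(text):
    """Remove acute and circumflex accents but keep the tilde and cedilla."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    kept = "".join(ch for ch in decomposed if ch not in "\u0301\u0302")
    return unicodedata.normalize("NFC", kept)


def needs_accent_mark(syllables, stressed):
    """Say whether a word of two or more syllables takes a written accent.

    `syllables` is the word split into syllables; `stressed` is the index
    (from the start) of the stressed syllable. Hypothetical helper.
    """
    if len(syllables) < 2:
        raise ValueError("monosyllables are outside this sketch")
    from_end = len(syllables) - 1 - stressed
    if from_end == 2:
        return True  # stressed on the antepenult: always accented
    listed = bool(_LISTED_ENDING.match(_strip_stress_marks(syllables[-1])))
    if from_end == 0:
        return listed  # oxytone: accented only with a listed ending
    if from_end == 1:
        return not listed  # paroxytone: accented with any other ending
    raise ValueError("stress must fall on one of the last three syllables")


print(needs_accent_mark(["a", "ma", "vel"], 1))  # amável
print(needs_accent_mark(["ca", "sa"], 0))        # casa
```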
Monosyllables are typically not accented, but those whose last vowel is a, e, or o, possibly followed by final -s, -m or -ns, may require an accent mark.
Portuguese accentuation rules differ somewhat from those of Spanish with regard to syllabification: English "continuous" is Portuguese contínuo, Spanish continuo, and English "I continue" is Portuguese continuo, Spanish continúo, although in both cases the same syllable is stressed in Portuguese and Spanish.
The use of diacritics in personal names is generally restricted to the combinations above, often also by the applicable Portuguese spelling rules.
Portugal is more restrictive than Brazil with regard to given names. They must be Portuguese or adapted to Portuguese orthography and sound, and should also be easily recognised as either a masculine or a feminine name by a Portuguese speaker. There are lists of previously accepted and refused names, and names that are both unusual and not included in the list of previously accepted names must be submitted for consultation to the national director of registries. The list of previously accepted names does not include some of the most common names, like "Pedro" (Peter) or "Ana" (Anne). Brazilian birth registrars, on the other hand, are likely to accept names containing any (Latin) letters or diacritics and are limited only by the availability of such characters in their typesetting facility.
Most consonants have the same values as in the International Phonetic Alphabet, except for the palatals /ʎ/ and /ɲ/, which are spelled lh and nh, respectively, and the following velars, rhotics, and sibilants:
|Phoneme||Default||Before e or i|
The alveolar tap /ɾ/ is always spelled as a single r. The other rhotic phoneme of Portuguese, which may be pronounced as a trill [r] or as one of the fricatives [x], [ʁ], or [h], according to the idiolect of the speaker, is either written rr or r, as described below.
|Phoneme||Start of syllable[rhotic note 1]||Between vowels||End of syllable[rhotic note 2]|
|/ʁ/||r||rosa, tenro, guelra||rr||carro||r||sorte, mar|
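The spelling pattern in the table above can be sketched as a position test. The function name `classify_rhotics` and the labels "strong" (for /ʁ/) and "tap" (for /ɾ/) are hypothetical. The sketch assumes that every single r not covered by the table (between vowels, or after a consonant other than l, n, s) is a tap, and it ignores the case of prefixes ending in a consonant.

```python
import unicodedata

VOWELS = set("aeiou")


def classify_rhotics(word):
    """Return (index, label) pairs for each rhotic in `word`.

    Indices refer to the lowercase word with diacritics removed.
    Hypothetical helper; labels follow the table above.
    """
    decomposed = unicodedata.normalize("NFD", word.lower())
    w = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    result = []
    i = 0
    while i < len(w):
        if w[i] != "r":
            i += 1
            continue
        if w[i:i + 2] == "rr":
            result.append((i, "strong"))  # rr between vowels
            i += 2
            continue
        prev = w[i - 1] if i > 0 else None
        nxt = w[i + 1] if i + 1 < len(w) else None
        if nxt is None or nxt not in VOWELS:
            label = "strong"  # end of syllable: sorte, mar
        elif prev is None or prev in {"l", "n", "s"}:
            label = "strong"  # start of syllable: rosa, tenro, guelra
        else:
            label = "tap"  # assumed for all remaining single r
        result.append((i, label))
        i += 1
    return result


print(classify_rhotics("raro"))
```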
For the following phonemes, the phrase "at the start of a syllable" can be understood as "at the start of a word, or between a consonant and a vowel, in that order".
|Phoneme||Start of syllable[A]||Between vowels||End of syllable|
|/s/||s, c[B]||sapo, psique,
|ss, ç,[C] c,[B] x[D]||assado, passe,
|s, x,[E] z[F]||isto,|
|/ʃ/||ch, x||chuva, cherne,
|ch, x||fecho, duche,|
|s, z, x[G]||rosa, Brasil, prazo, azeite, exemplo||s, x,[H] z[H]||turismo,|
|/ʒ/||j, g[B]||jogo, jipe,
|j, g[B]||ajuda, pajem,|
Note that there are two main groups of accents in Portuguese, one in which the sibilants are alveolar at the end of syllables (/s/ or /z/), and another in which they are postalveolar (/ʃ/ or /ʒ/). In this position, the sibilants occur in complementary distribution, voiced before voiced consonants, and voiceless before voiceless consonants or at the end of utterances.
The vowels in the pairs /a, ɐ/, /e, ɛ/, /o, ɔ/ only contrast in stressed syllables. In unstressed syllables, each element of the pair occurs in complementary distribution with the other. Stressed /ɐ/ appears mostly before the nasal consonants m, n, nh, followed by a vowel, and stressed /a/ appears mostly elsewhere although they have a limited number of minimal pairs in EP.
In Brazilian Portuguese, both nasal and unstressed vowel phonemes that only contrast when stressed tend to a mid height though [a] may be often heard in unstressed position (especially when singing or speaking emphatically). In pre-20th-century European Portuguese, they tended to be raised to [ə], [i] (now [ɯ̽] except when close to another vowel) and [u]. It still is the case of most Brazilian dialects in which the word elogio may be variously pronounced as [iluˈʒiu], [e̞lo̞ˈʒiu], [e̞luˈʒiu], etc. Some dialects, such as those of Northeastern and Southern Brazil, tend to do less pre-vocalic vowel reduction and in general the unstressed vowel sounds adhere to that of one of the stressed vowel pair, namely [ɛ, ɔ] and [e, o] respectively.
In educated speech, vowel reduction is used less often than in colloquial and vernacular speech though still more than the more distant dialects, and in general, mid vowels are dominant over close-mid ones and especially open-mid ones in unstressed environments when those are in free variation (that is, sozinho is always [sɔˈzĩɲu], even in Portugal, while elogio is almost certainly [e̞lo̞ˈʒi.u]). Mid vowels are also used as choice for stressed nasal vowels in both Portugal and Rio de Janeiro though not in São Paulo and southern Brazil, but in Bahia, Sergipe and neighboring areas, mid nasal vowels supposedly are close-mid like those of French. Veneno can thus vary as EP [vɯ̽ˈne̞nu], RJ [vẽ̞ˈnẽ̞nu], SP [veˈnenʊ] and BA [vɛˈnɛ̃nu] according to the dialect. /ɐ̃/ also has significant variation, as shown in the respective dialect pronunciations of banana as [baˈnə̃nə], [bə̃ˈnə̃nə], and [bəˈnənə].
Vowel reduction of unstressed nasal vowels is extremely pervasive nationwide in Brazil, in vernacular, colloquial and even most educated speech registers. It is slightly more resisted but still present in Portugal.
The pronunciation of the accented vowels is fairly stable except that they become nasal in certain conditions. See #Nasalization for further information about this regular phenomenon. In other cases, nasal vowels are marked with a tilde.
The grave accent is used only on the letter a and is merely grammatical, meaning a crasis between two a such as the preposition "to" and the feminine article "the" (vou a cidade → vou à cidade "I'm going to the city"). In dialects where unstressed a is pronounced /ɐ/, à is pronounced /a/; in dialects where unstressed a is /a/ the grave accent makes no difference in pronunciation.
There was a proposal to use the grave accent to separate unstressed diphthongs, e.g.: saìmento, paìsagem, saùdar.
The trema was official prior to the last orthographical reform and can still be found in older texts. It meant that the usually silent u between q or g and i or e is in fact pronounced: líqüido “liquid” and sangüíneo “related to blood”. Some words have two acceptable pronunciations, varying largely by accent.
It was also proposed to use the grave accent instead of trema, e.g.: líqùido, sangùíneo.
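When older texts are processed, the trema spellings described above can be mapped to their current forms by dropping the diaeresis from u. The sketch below is deliberately naive and the name `modernize_trema` is hypothetical: it handles only ü and Ü, and it does not check whether the letter stands between q or g and e or i.

```python
import unicodedata


def modernize_trema(text):
    """Replace ü/Ü with u/U (hypothetical helper, naive)."""
    # NFC makes sure "u" + combining diaeresis becomes the single letter ü.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(str.maketrans({"\u00fc": "u", "\u00dc": "U"}))


print(modernize_trema("líqüido"))
print(modernize_trema("sangüíneo"))
```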
The pronunciation of each diphthong is also fairly predictable, but one must know how to distinguish true diphthongs from adjacent vowels in hiatus, which belong to separate syllables. For example, in the word saio /ˈsaiu/ ([ˈsaj.ju]), the i forms a clearer diphthong with the previous vowel (but a slight yod also in the next syllable is generally present), but in saiu /sɐˈiu/ ([sɐˈiw]), it forms a diphthong with the next vowel. As in Spanish, a hiatus may be indicated with an acute accent, distinguishing homographs such as saia /ˈsaiɐ/ ([ˈsaj.jɐ]) and saía /sɐˈiɐ/.
|ai, ái||[ai]||au, áu||[au]|
|ei, êi||[ei ~ eː], [əi][i]||eu, êu||[eu]|
|oi||[oi]||ou||[ou ~ oː]|
|ei, éi||[ɛi], [əi][i]||eu, éu||[ɛu]|
When a syllable ends with m or n, the consonant is not fully pronounced but merely indicates the nasalization of the vowel which precedes it. At the end of words, it generally produces a nasal diphthong.
|-un, -um, -ún, -úm[a]||/ũ/|
|-on, -om, -ôn, -ôm[a]||/õ/|
|-an, -am, -ân, -âm[b]||/ɐ̃/||-am[c]||/ɐ̃ũ/|
|-en, -em, -ên, -êm[b]||/ẽ/||-em, -êm[c]||-en-[d]||/ẽĩ/ ([ɐ̃ĩ])|
|-in, -im, -ín, -ím[a]||/ĩ/|
The letter m is conventionally written before b or p or at the end of words (also in a few compound words such as comummente - comumente in Brazil), and n is written before other consonants. In the plural, the ending -m changes into -ns; for example bem, rim, bom, um → bens, rins, bons, uns. Some loaned words end with -n (which is usually pronounced in European Portuguese).
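The plural pattern for words ending in -m can be written as a one-line transformation. The helper name `pluralize_final_m` is hypothetical, and the sketch covers only this one ending, not Portuguese plural formation in general.

```python
def pluralize_final_m(word):
    """Turn a singular ending in -m into its plural in -ns (hypothetical helper)."""
    if not word.endswith("m"):
        raise ValueError("this sketch only handles words ending in -m")
    # The final m is replaced by ns: bem -> bens.
    return word[:-1] + "ns"


for singular in ["bem", "rim", "bom", "um"]:
    print(singular, "->", pluralize_final_m(singular))
```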
Nasalization of ui, according to modern orthography, is left unmarked in the six words muito, muita, muitos, muitas, mui, ruim (the last only in Brazilian Portuguese). During some periods, the nasal ui was marked as ũi: mũi, mũita, mũito, mũitas, mũitos.
The word endings -am, -em, -en(+s), with or without an accent mark on the vowel, represent nasal diphthongs derived from various Latin endings, often -ant, -unt or -en(t)-. Final -am, which appears in polysyllabic verbs, is always unstressed. The grapheme -en- is also pronounced as a nasal diphthong in a few compound words, such as bendito (bem + dito), homenzinho (homem + zinho), and Benfica.
Verbs whose infinitive ends in -jar have j in the whole conjugation: viagem "voyage" (noun) but viajem (third person plural of the present subjunctive of the verb viajar "to travel").
Verbs whose thematic vowel becomes a stressed i in one of their inflections are spelled with an i in the whole conjugation, as are other words of the same family: crio (I create) implies criar (to create) and criatura (creature).
Verbs whose thematic vowel becomes a stressed ei in one of their inflections are spelled with an e in the whole conjugation, as are other words of the same family: nomeio (I nominate) implies nomear (to nominate) and nomeação (nomination).
The majority of the Portuguese lexicon is derived from Latin, Celtic, Greek, some Germanic and some Arabic. In principle, spelling it correctly would therefore require some knowledge of those languages. However, Greek words are Latinized before being incorporated into the language, and many words of Latin or Greek origin have easily recognizable cognates in English and other western European languages and are spelled according to similar principles. For instance, glória, "glory", glorioso, "glorious", herança "inheritance", real "real/royal". Some general guidelines for spelling are given below:
Loanwords with a /ʃ/ in their original languages receive the letter x to represent it when they are nativised: xampu (shampoo). While the pronunciations of ch and x merged long ago, some Galician-Portuguese dialects like the Galician language, the portunhol da pampa and the speech registers of northeastern Portugal still preserve the difference as ch /tʃ/ vs. x /ʃ/, as do other Iberian languages and Medieval Portuguese. When one wants to stress the sound difference in dialects in which it has merged, the convention is to use tch: tchau (ciao) and Brazilian Portuguese República Tcheca (Czech Republic). In most loanwords, it merges with /ʃ/ (or /t/: moti for mochi), just as [dʒ] most often merges with /ʒ/. Alveolar affricates [ts] and [dz], though, are more likely to be preserved (pizza, Zeitgeist, tsunami, kudzu, adzuki, etc.), although not all of these hold up across some dialects (/zaitʃiˈgaiʃtʃi/ for Zeitgeist, /tʃisuˈnɐ̃mi/ for tsunami and /aˈzuki/ for adzuki [along with the spelling azuki]).
Portuguese syllabification rules require a syllable break between double letters: cc, cç, mm, nn, rr, ss, or other combinations of letters that may be pronounced as a single sound: fric-ci-o-nar, pro-ces-so, car-ro, ex-ce(p)-to, ex-su-dar. Only the digraphs ch, lh, nh, gu, qu, and ou are indivisible. All digraphs are however broken down into their constituent letters for the purposes of collation, spelling aloud, and in crossword puzzles.
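The rule for double letters can be sketched with a regular expression that inserts a break inside cc, cç, mm, nn, rr, and ss. The name `break_double_letters` is hypothetical, and the sketch marks only these breaks; it is not a full syllabifier, so the other syllable boundaries of a word are left unmarked.

```python
import re

# A letter followed by its double partner: c before c or ç, and m, n, r, s
# before an identical letter. The lookahead keeps the second letter in place.
_DOUBLE = re.compile(r"c(?=[c\u00e7])|m(?=m)|n(?=n)|r(?=r)|s(?=s)")


def break_double_letters(word):
    """Insert a hyphen between double letters (hypothetical helper)."""
    return _DOUBLE.sub(lambda m: m.group(0) + "-", word)


print(break_double_letters("carro"))
print(break_double_letters("processo"))
```

The digraphs ch, lh, nh, gu, qu, and ou never match the pattern, so they stay undivided, in line with the rule stated above.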
The apostrophe (') appears as part of certain phrases, usually to indicate the elision of a vowel in the contraction of a preposition with the word that follows it: de + água = d'água. It is used almost exclusively in poetry.
The hyphen (-) is used to make compound words, especially plants and animal names like papagaio-de-rabo-vermelho "red-tailed parrot".
It is also extensively used to append clitic pronouns to the verb, as in quero-o "I want it" (enclisis), or even to embed them within the verb (mesoclisis), as in levaria + vos + os = levar-vo-los-ia "I would take them to you". Proclitic pronouns are not connected graphically to the verb: não o quero "I do not want it". Each element in such compounds is treated as an individual word for accentuation purposes: matarias + o = matá-lo-ias "You would kill it/him", beberá + a = bebê-la-á "He/she will drink it".
In European Portuguese, as in many other European languages, angular quotation marks are used for general quotations in literature:
Although American-style (“…”) or British-style (‘…’) quotation marks are sometimes used as well, especially in less formal types of writing (they are more easily produced on keyboards) or inside nested quotations, they are less common in careful writing. In Brazilian Portuguese, only American and British-style quote marks are used.
In both varieties of the language, dashes are normally used for direct speech rather than quotation marks:
Prior to the Portuguese Language Orthographic Agreement of 1990, Portuguese had two orthographic standards:
The table to the right illustrates typical differences between the two orthographies. Some are due to different pronunciations, but others are merely graphic. The main ones are:
As of 2016, the reformed orthography (1990 Agreement) is obligatory in Brazil, Cape Verde, and Portugal.
|Convention||Portuguese-speaking countries except Brazil before the 1990 agreement||Brazil before the 1990 agreement||All countries after the 1990 agreement||translation|
|Different pronunciation||anónimo||anônimo||Both forms remain||anonymous|
|Vénus||Vênus||Both forms remain||Venus|
|facto||fato||Both forms remain||fact|
|Non-personal and non-geographical names||Janeiro||janeiro||janeiro||January|