The difference between superscript/subscript and numerator/denominator glyphs. In many popular fonts the Unicode "superscript" and "subscript" characters are actually numerator and denominator glyphs.

Unicode has subscripted and superscripted versions of a number of characters including a full set of Arabic numerals.[1] These characters allow any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup like HTML or TeX.
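As an illustration of writing such formulas in plain text, the following minimal sketch maps ASCII digits and a few signs to their Unicode subscript and superscript counterparts. The helper names (`to_subscript`, `to_superscript`) are hypothetical and chosen only for this example. The code points are those of the Superscripts and Subscripts block plus the three Latin-1 superscript digits.

```python
# ASCII characters that have both a subscript and a superscript counterpart.
ASCII_CHARS = "0123456789+-=()"

# Subscripts are contiguous: U+2080..U+2089 (digits), U+208A..U+208E (+ - = ( )).
SUBSCRIPT_CHARS = "".join(chr(0x2080 + i) for i in range(15))

# Superscript 1, 2 and 3 live in Latin-1 (U+00B9, U+00B2, U+00B3);
# the rest are at U+2070 and U+2074..U+207E.
SUPERSCRIPT_CHARS = "\u2070\u00b9\u00b2\u00b3" + "".join(
    chr(0x2070 + i) for i in range(4, 15)
)

SUBSCRIPTS = str.maketrans(ASCII_CHARS, SUBSCRIPT_CHARS)
SUPERSCRIPTS = str.maketrans(ASCII_CHARS, SUPERSCRIPT_CHARS)


def to_subscript(text: str) -> str:
    """Replace digits and + - = ( ) with subscript characters."""
    return text.translate(SUBSCRIPTS)


def to_superscript(text: str) -> str:
    """Replace digits and + - = ( ) with superscript characters."""
    return text.translate(SUPERSCRIPTS)


water = "H" + to_subscript("2") + "O"
polynomial = "x" + to_superscript("2") + " + 3x + 2"
print(water, polynomial)
```

Characters without a counterpart in the tables (letters, for instance) pass through unchanged, so the helpers should be applied only to the part of the string that is meant to be raised or lowered.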

The World Wide Web Consortium and the Unicode Consortium have made recommendations on the choice between using markup and using superscript and subscript characters:

When used in mathematical context (MathML) it is recommended to consistently use style markup for superscripts and subscripts.... However, when super and sub-scripts are to reflect semantic distinctions, it is easier to work with these meanings encoded in text rather than markup, for example, in phonetic or phonemic transcription.[2]

Uses

The intended use[2] when these characters were added to Unicode was to allow chemical and algebraic formulas and phonetic transcriptions to be written without markup while still producing true superscripts and subscripts. Thus "H₂O" (using a subscript character) is supposed to render identically to "H2O" with the 2 set as a subscript by markup.
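The intended equivalence between the character form and the markup form is recorded in the Unicode Character Database as a compatibility decomposition tagged `<super>` or `<sub>`. The sketch below uses Python's standard `unicodedata` module to read that tag and emit HTML; the function name `to_markup` is hypothetical, and the output format is an assumption made for illustration.

```python
import html
import unicodedata


def to_markup(text: str) -> str:
    """Replace Unicode superscript/subscript characters with <sup>/<sub> HTML."""
    out = []
    for ch in text:
        # For SUBSCRIPT TWO this returns "<sub> 0032"; for plain letters, "".
        tag, _, codes = unicodedata.decomposition(ch).partition(" ")
        if tag in ("<super>", "<sub>"):
            base = "".join(chr(int(code, 16)) for code in codes.split())
            element = "sup" if tag == "<super>" else "sub"
            out.append(f"<{element}>{html.escape(base)}</{element}>")
        else:
            out.append(html.escape(ch))
    return "".join(out)


print(to_markup("H\u2082O"))                         # subscript two
print(unicodedata.normalize("NFKC", "H\u2082O"))     # drops the distinction
```

Each character is wrapped separately, so a run such as "₁₀" yields two adjacent `<sub>` elements. NFKC normalization, by contrast, discards the superscript or subscript distinction entirely, which is one reason markup is preferred when the raised or lowered position carries meaning.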

In reality, most fonts that include these characters ignore the Unicode definition and instead design the digits as mathematical numerator and denominator glyphs,[3][4] which are smaller than normal characters but aligned with the cap line and the baseline, respectively. When used with the solidus, these glyphs are useful for making arbitrary diagonal fractions (similar to the ½ glyph). Building a fraction from superscript and subscript markup requires many characters, and the source does not look like the rendered fraction (example: 1/2), so font designers provided this alternative. The design also makes the superscript letters useful as ordinal indicators, more closely matching the ª and º characters. However, it makes the characters incorrect as ordinary superscripts and subscripts, and formulas are rendered correctly by using markup rather than these characters.

Unicode intended diagonal fractions to be produced through a different mechanism, but it is very poorly supported. The fraction slash U+2044 is visually similar to the solidus, but when used with the ordinary digits (not the superscripts and subscripts) it is intended to tell a layout system that a fraction such as ¾ should be rendered[5] using automatic glyph substitution[a] for the digits. Some browsers support this,[b] but not in all fonts. A selection of fonts is shown in the table below.

Comparison of encodings of simple fractions
Characters Font Result
U+00BD ½ VULGAR FRACTION ONE HALF Default ½
U+00B9 ¹ SUPERSCRIPT ONE, U+002F / SOLIDUS, U+2082 SUBSCRIPT TWO ¹/₂
U+00B9 ¹ SUPERSCRIPT ONE, U+2044 FRACTION SLASH, U+2082 SUBSCRIPT TWO ¹⁄₂
U+0031 1 DIGIT ONE, U+2044 FRACTION SLASH, U+0032 2 DIGIT TWO 1⁄2
Arial 1⁄2
Cambria 1⁄2
Consolas 1⁄2
Times New Roman 1⁄2
FiraGO 1⁄2
EB Garamond 1⁄2
Cantarell 1⁄2
Lato 1⁄2
Linux Libertine O 1⁄2
Nimbus Roman 1⁄2
Ubuntu 1⁄2
Yrsa 1⁄2
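The two ways of encoding a fraction compared in the table can be sketched as follows. The function names are hypothetical; the only facts relied on are the code points of the superscript and subscript digits and of U+2044 FRACTION SLASH. Whether the second form is displayed as a diagonal fraction depends on the font and layout engine, as the table illustrates.

```python
FRACTION_SLASH = "\u2044"

DIGITS = "0123456789"
# Superscript 1-3 are in Latin-1; the other superscript digits start at U+2070.
SUPERSCRIPT_DIGITS = "\u2070\u00b9\u00b2\u00b3" + "".join(
    chr(0x2070 + d) for d in range(4, 10)
)
SUBSCRIPT_DIGITS = "".join(chr(0x2080 + d) for d in range(10))

TO_SUPERSCRIPT = str.maketrans(DIGITS, SUPERSCRIPT_DIGITS)
TO_SUBSCRIPT = str.maketrans(DIGITS, SUBSCRIPT_DIGITS)


def _check(numerator: int, denominator: int) -> None:
    if numerator < 0 or denominator <= 0:
        raise ValueError("expected a non-negative numerator and a positive denominator")


def glyph_fraction(numerator: int, denominator: int) -> str:
    """Superscript digits, a solidus, then subscript digits (relies on font design)."""
    _check(numerator, denominator)
    return (
        str(numerator).translate(TO_SUPERSCRIPT)
        + "/"
        + str(denominator).translate(TO_SUBSCRIPT)
    )


def slash_fraction(numerator: int, denominator: int) -> str:
    """Ordinary digits joined by U+2044 (relies on layout-engine support)."""
    _check(numerator, denominator)
    return f"{numerator}{FRACTION_SLASH}{denominator}"


print(glyph_fraction(1, 2), slash_fraction(1, 2))
```

The first form looks acceptable only in fonts that draw these characters as numerator and denominator glyphs; the second degrades to ordinary digits around a slash where the substitution is unsupported.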

Superscripts and subscripts block

Main article: Superscripts and Subscripts (Unicode block)

The most common superscript digits (1, 2, and 3) were in ISO-8859-1 and were therefore carried over into those positions in the Latin-1 range of Unicode. The rest were placed in a dedicated section of Unicode at U+2070 to U+209F. The two tables below show these characters. Each superscript or subscript character is preceded by a normal x to show the subscripting/superscripting. The table on the left contains the actual Unicode characters; the one on the right contains the equivalents using HTML markup for the subscript or superscript.

Unicode characters
0 1 2 3 4 5 6 7 8 9 A B C D E F
U+00Bx x² x³ x¹
U+207x x⁰ xⁱ x⁴ x⁵ x⁶ x⁷ x⁸ x⁹ x⁺ x⁻ x⁼ x⁽ x⁾ xⁿ
U+208x x₀ x₁ x₂ x₃ x₄ x₅ x₆ x₇ x₈ x₉ x₊ x₋ x₌ x₍ x₎
U+209x xₐ xₑ xₒ xₓ xₔ xₕ xₖ xₗ xₘ xₙ xₚ xₛ xₜ
Simulated using <sup> or <sub> tags
0 1 2 3 4 5 6 7 8 9 A B C D E F
U+00Bx x2 x3 x1
U+207x x0 xi x4 x5 x6 x7 x8 x9 x+ x− x= x( x) xn
U+208x x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x+ x− x= x( x)
U+209x xa xe xo xx xə xh xk xl xm xn xp xs xt
  Reserved for future use.
  Other characters from Latin-1 not related to super- or sub-scripts.
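Because the superscript digits are split between Latin-1 and the dedicated block, code that looks them up has to treat 1, 2 and 3 as special cases. The sketch below shows one way to do this and to list the assigned characters of a range using the standard `unicodedata` module. The function names are hypothetical, and the set of assigned characters reported depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Superscript 1, 2 and 3 were inherited from ISO-8859-1.
LATIN1_SUPERSCRIPTS = {1: "\u00b9", 2: "\u00b2", 3: "\u00b3"}


def superscript_digit(d: int) -> str:
    """Return the superscript character for a single digit 0-9."""
    if not 0 <= d <= 9:
        raise ValueError("expected a single digit 0-9")
    return LATIN1_SUPERSCRIPTS.get(d, chr(0x2070 + d))


def subscript_digit(d: int) -> str:
    """Return the subscript character for a single digit 0-9 (U+2080..U+2089)."""
    if not 0 <= d <= 9:
        raise ValueError("expected a single digit 0-9")
    return chr(0x2080 + d)


def assigned_names(start: int, end: int) -> dict[int, str]:
    """Map each assigned code point in [start, end] to its Unicode name."""
    names = {}
    for cp in range(start, end + 1):
        name = unicodedata.name(chr(cp), None)  # None for unassigned code points
        if name is not None:
            names[cp] = name
    return names


block = assigned_names(0x2070, 0x209F)
for cp, name in block.items():
    print(f"U+{cp:04X} {chr(cp)} {name}")
```

The positions U+2072 and U+2073 do not appear in the listing: they are left unassigned because the corresponding superscript two and three already exist at U+00B2 and U+00B3.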

Other superscript and subscript characters

Unicode version 15.0 also includes subscript and superscript characters that are intended for semantic usage, in the following blocks:[1][6]

Superscript
Combining superscript
Subscript
Combining subscript

Latin, Greek and Cyrillic tables

See also: superscript IPA letters

Taken together, the Unicode standard contains superscript and subscript versions of a subset of Latin, Greek and Cyrillic letters. Here they are arranged in alphabetical order for comparison (or for copy-and-paste convenience). Since these characters come from different Unicode ranges, they may not appear to be the same size or in the same position because of font substitution in the browser. Shaded cells mark small capitals that are not very distinct from minuscules, and Greek letters that are indistinguishable from Latin ones, and so would not be expected to be supported by Unicode.

Latin superscript and subscript letters
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Superscript capital ᴿ
Superscript small cap 𐞄 𐞒 𐞖 𐞪 𐞲
Superscript minuscule ʰ ʲ ˡ 𐞥 ʳ ˢ ʷ ˣ ʸ
Overscript capital ◌ᷛ ◌ᷞ ◌ᷟ ◌ᷡ ◌ᷢ
Overscript minuscule ◌ͣ ◌ᷨ ◌ͨ ◌ͩ ◌ͤ ◌ᷫ ◌ᷚ ◌ͪ ◌ͥ ◌ᷜ ◌ᷝ ◌ͫ ◌ᷠ ◌ͦ ◌ᷮ ◌ͬ ◌ᷤ ◌ͭ ◌ͧ ◌ͮ ◌ᷱ ◌ͯ ◌ᷦ
Subscript minuscule
Underscript minuscule ◌᷊ ◌ᪿ
Greek superscript and subscript letters
Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω
Superscript minuscule ⁽ᵋ⁾ ᶿ ⁽ᶥ⁾ ⁽ᶹ⁾
Overscript minuscule ◌ᷩ
Subscript minuscule
Other IPA superscript and subscript letters
ɑ æ ç ð ə ʃ ʍ ʔ
Superscript See superscript IPA letters
Overscript ◌ᷧ ◌ᷔ ◌ᷗ ◌ᷙ ◌ᷪ ◌ᷯ ◌̉
Subscript
Underscript ◌ᫀ

(Superscript ɩ ᶅ ƫ ɷ, which are no longer IPA, are ⟨ᶥ ᶪ ᶵ 𐞤⟩.)

Cyrillic superscript and subscript letters
А Ә Б В Г Ґ Д Е Є Ж З Ѕ И І Ї Ј К Л М Н О Ө П Р С Ҫ
Superscript 𞀰 𞁋 𞀱 𞀲 𞀳 𞀴 𞀵 𞀶 𞀷 𞁊 𞀸 𞁌 𞁍 𞀹 𞀺 𞀻 𞀼 𞁎 𞀽 𞀾 𞀿 𞁫
Overscript ◌ⷶ ◌ⷠ ◌ⷡ ◌ⷢ ◌ⷣ ◌ⷷ ◌ꙴ ◌ⷤ ◌ⷥ ◌ꙵ ◌𞂏 ◌ꙶ ◌ⷦ ◌ⷧ ◌ⷨ ◌ⷩ ◌ⷪ ◌ⷫ ◌ⷬ ◌ⷭ
Subscript 𞁑 𞁒 𞁓 𞁔 𞁧 𞁕 𞁖 𞁗 𞁘 𞁩 𞁙 𞁨 𞁚 𞁛 𞁜 𞁝 𞁞
Т У Ү Ұ Ф Х Ѡ Ц Ч Џ Ш Щ Ъ Ы Ь Ѣ Э Ю Ѥ Ѧ Ѫ Ѭ Ѳ Ӏ
Superscript 𞁀 𞁁 𞁏 𞁭 𞁂 𞁃 𞁄 𞁅 𞁆 𞁬 𞁇 𞁈 𞁉 𞁐
Overscript ◌ⷮ ◌ꙷ ◌ⷹ ◌ꚞ ◌ⷯ ◌ꙻ ◌ⷰ ◌ⷱ ◌ⷲ ◌ⷳ ◌ꙸ ◌ꙹ ◌ꙺ ◌ⷺ ◌ⷻ ◌ⷼ ◌ꚟ ◌ⷽ ◌ⷾ ◌ⷿ ◌ⷴ
Subscript 𞁟 𞁠 𞁡 𞁢 𞁣 𞁪 𞁤 𞁥 𞁦

Many of these characters were added in Unicode 15, published in 2022, in the Cyrillic Extended-D block.[8]

See also small caps in Unicode.

Composite characters

Primarily for compatibility with earlier character sets, Unicode contains a number of characters that compose super- and subscripts with other symbols.[1] In most fonts these render much better than attempts to construct these symbols from the above characters or by using markup.

Notes

  a. For a general overview and technical information on glyph substitution (though not specifically for fractions), see the GSUB (Glyph Substitution Table) chapter of the OpenType specification on the Microsoft Typography site.
  b. Such as Chrome on Windows, Firefox[failed verification]

References

  1. "UCD: UnicodeData.txt". The Unicode Standard. Retrieved 14 May 2016.
  2. Martin Dürst, Asmus Freytag (16 May 2007). "Unicode in XML and other Markup Languages". W3C. Retrieved 13 September 2010.
  3. "fraction | Dart Package". Dart packages. 27 December 2021. Retrieved 21 September 2022.
  4. "MathML | General layout elements | Fractions". data2type GmbH (in German). 30 March 2021. Retrieved 13 January 2022.
  5. Martin Dürst, Asmus Freytag (16 May 2007). "Fraction Slash". W3C. Retrieved 13 September 2010.
  6. "UCD: Scripts.txt". The Unicode Standard. Retrieved 21 September 2022.
  7. Everson, Michael; West, Andrew (5 October 2020). "L2/20-268: Revised proposal to add ten characters for Middle English to the UCS" (PDF).
  8. Cyrillic Extended-D. Range: 1E030–1E08F.
  9. Silva, Eduardo Marín (1 March 2017). "L2/17-066R: Proposal to encode the Marca Registrada sign" (PDF).