Linguistic notes

Wherever possible, modern orthography has been used for names, titles and lyrics. For languages without a widely accepted Latin orthography, the following Romanisation systems have been employed:

Language(s)                         Romanisation system
Abkhaz                              BGN/PCGN (2011)
Adyghe                              own Romanisation, see below
Amharic                             BGN/PCGN (1967)
Arabic                              EI3 (2007), with modifications¹
Armenian                            BGN/PCGN (1981)
Bashkort                            national (2007 or earlier)
Bulgarian                           BGN/PCGN (1952)
Burmese                             Okell (1969), with modifications²
Buryat                              own Romanisation, see below
Cantonese                           Yale (1958)
Church Slavonic                     own Romanisation, see below
Dravidian and Indo-Aryan languages  own Romanisation, see below
Dzongkha                            van Driem (2019)
Georgian                            BGN/PCGN (1981)
Greek                               ELOT 743 Type 2 (2001)
Hebrew                              Academy of the Hebrew Language (2006)³
Ingush                              own Romanisation, see below
Japanese                            Hepburn (1908)
Khmer                               BGN/PCGN (1972)
Kildin Sámi                         Rießler (2022)
Komi                                own Romanisation, see below
Korean                              McCune–Reischauer (1939)
Lao                                 CNT (1966 or earlier)
Mandarin                            Hànyǔ Pīnyīn (1958)
Mari                                own Romanisation, see below
Mongolian                           BGN/PCGN (1964)
Mordvinic languages                 own Romanisation, see below
Ossetian                            BGN/PCGN (2009)
Pashto                              BGN/PCGN (2017), with modifications⁴
Persian (including Dari and Tajik)  UniPers (2003)⁵
Russian                             BGN/PCGN (1947)⁶
Syriac                              BGN/PCGN (2011)
Tatar                               national (2012)
Thai                                RTGS (1999), with modifications⁷
Tibetan                             own Romanisation, see below
Tigrinya                            BGN/PCGN (2007)
Udmurt                              BGN/PCGN (2011)
Ukrainian                           BGN/PCGN (1965)
Yiddish                             YIVO
other Turkic languages              Common Turkic Alphabet (2024)
  1. Apostrophes are used instead of half-rings, and all sun letter assimilations are applied.
  2. The voicing of consonants is indicated according to their actual pronunciation.
  3. If a transcription based on Ashkenazi pronunciation is required, tsere is Romanised as ⟨ey⟩; kamats gadol as ⟨o⟩; ẖolam as ⟨ô⟩; and ungeminated tav as ⟨s⟩.
  4. The transcription has been simplified by omitting silent consonants and diacritics that do not affect the pronunciation.
  5. The following extensions are used for Dari and Tajik: ⟨ğ⟩ for the sound /ɣ~ʁ/ (distinct from /q/); ⟨î û⟩ for the long vowels /iː uː/ in Dari; and ⟨ü⟩ for the phoneme /ʉ/ in Tajik.
  6. If an unambiguous representation of the pre-revolutionary orthography is required, ⟪і⟫ is Romanised as ⟨ī⟩ and ⟪ѣ⟫ as ⟨ě⟩.
  7. The following extensions have been taken from the 1939 version of the system: ⟨čh⟩ for unaspirated /tɕ/; and ⟨ǫ⟩ for the open vowel /ɔ/.

Adyghe

The Romanisation is based on the BGN/PCGN system (2012), with the following changes: ⟪е⟫ is Romanised as ⟨ye⟩; ⟪у⟫ as ⟨w⟩ if it is pronounced /w/ or if it labialises the preceding consonant, otherwise as ⟨u⟩; ⟪хъ⟫ as ⟨k͟h⟩; and ⟪э⟫ as ⟨e⟩. The apostrophe after ⟨q⟩ is omitted.

Buryat

The Romanisation is based on the BGN/PCGN system for Russian (1947), with the following additions and changes: ⟪е⟫ → ⟨ye⟩ (always); ⟪ё⟫ → ⟨yo⟩; ⟪ө⟫ → ⟨ö⟩; ⟪ү⟫ → ⟨ü⟩; ⟪һ⟫ → ⟨h⟩. The apostrophe after ⟨sh⟩ is omitted.
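The Buryat changes can be pictured as a small override table laid over a base table. The following Python sketch is only an illustration under stated assumptions: the names `BURYAT_OVERRIDES`, `BASE_EXCERPT` and `romanise_buryat` are hypothetical, the base table is a tiny excerpt that covers just the demo words, and the rule about the apostrophe after ⟨sh⟩ is not covered.

```python
# Letters added or changed relative to the base system (taken from the text above).
BURYAT_OVERRIDES = {"е": "ye", "ё": "yo", "ө": "ö", "ү": "ü", "һ": "h"}

# Hypothetical excerpt of a base table, only large enough for the demo words;
# a real table would cover the whole alphabet.
BASE_EXCERPT = {"а": "a", "д": "d", "л": "l", "н": "n", "р": "r", "у": "u", "э": "e"}


def romanise_buryat(text, base=BASE_EXCERPT):
    """Romanise letter by letter; overrides win over the base table."""
    table = {**base, **BURYAT_OVERRIDES}
    out = []
    for ch in text:
        latin = table.get(ch.lower())
        if latin is None:
            out.append(ch)                  # unknown characters pass through unchanged
        elif ch.isupper():
            out.append(latin.capitalize())  # keep the capitalisation of the source
        else:
            out.append(latin)
    return "".join(out)


print(romanise_buryat("Улаан-Үдэ"))  # Ulaan-Üde
```

Merging the two dictionaries with the overrides last means a changed letter such as ⟪е⟫ never falls back to its base value.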

Church Slavonic

The Romanisation is based on the BGN/PCGN system for Bulgarian (1952), with the following additions: ⟪є⟫ → ⟨e⟩; ⟪ѕ⟫ → ⟨dz⟩; ⟪ї⟫ → ⟨i⟩; ⟪ѡ⟫ → ⟨o⟩; ⟪ꙋ⟫ → ⟨u⟩; ⟪ы⟫ → ⟨y⟩; ⟪ѣ⟫ → ⟨ě⟩; ⟪ѧ⟫ → ⟨ę⟩; ⟪ѳ⟫ → ⟨th⟩. The original diacritics and word-final hard yers are omitted.
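As a rough illustration, the Church Slavonic additions and the two omissions could be applied in three passes: strip diacritics, drop word-final hard yers, then map letters. This Python sketch rests on several assumptions: the function and table names are hypothetical, the set of code points treated as "original diacritics" is a guess (grave, acute, inverted breve and the Cyrillic combining marks U+0483 to U+0487), the base table is a small excerpt, and only lowercase input is handled.

```python
import re
import unicodedata

# Additions listed in the text above.
ADDITIONS = {"є": "e", "ѕ": "dz", "ї": "i", "ѡ": "o", "ꙋ": "u",
             "ы": "y", "ѣ": "ě", "ѧ": "ę", "ѳ": "th"}

# Hypothetical excerpt of a base table, just enough for the demo words.
BASE_EXCERPT = {"а": "a", "в": "v", "г": "g", "д": "d", "з": "z", "р": "r"}

# Assumed set of diacritics to drop: grave, acute, inverted breve,
# and the Cyrillic combining marks U+0483..U+0487 (titlo, breathings, etc.).
DIACRITICS = re.compile("[\u0300\u0301\u0311\u0483-\u0487]")
FINAL_HARD_YER = re.compile(r"ъ\b")


def romanise_church_slavonic(text, base=BASE_EXCERPT):
    # Decompose first so that stripping an accent cannot be hidden inside a
    # precomposed letter, then recompose letters such as й and ї.
    text = unicodedata.normalize("NFD", text)
    text = DIACRITICS.sub("", text)
    text = unicodedata.normalize("NFC", text)
    text = FINAL_HARD_YER.sub("", text)
    table = {**base, **ADDITIONS}
    return "".join(table.get(ch, ch) for ch in text)


print(romanise_church_slavonic("градъ"))  # grad
```

The breve of ⟪й⟫ and the diaeresis of ⟪ї⟫ are deliberately left out of the stripped set, since they belong to the letters themselves.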

Dravidian and Indo-Aryan languages

The Romanisation is based on the ISO 15919:2001 standard, with option 9.1 (which allows macrons over ⟨ē⟩ and ⟨ō⟩ to be omitted in transcriptions from the Bengali and Devanagari scripts) applied, along with the following additional changes:

  1. The inherent vowel is omitted if silent. In Bengali, it is otherwise written as ⟨ô⟩.
  2. Ligatures are transcribed phonetically, not according to their orthography. Additionally, ⟨ri⟩ is written instead of ⟨r̥⟩ to reflect its pronunciation.
  3. Diacritics that do not affect pronunciation, and those on ⟨ā⟩ and ⟨ẏ⟩ in Bengali, are omitted.
  4. Once the previous changes have been applied, the following grapheme substitutions are made: ⟨c⟩ → ⟨ch⟩; ⟨ś⟩ → ⟨sh⟩; ⟨ṣ⟩ → ⟨ṣh⟩. In Bengali, ⟨s⟩ is also changed to ⟨sh⟩, except before a consonant.
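The substitutions in step 4 are easiest to get right in a single pass, because applying them one after another would turn a freshly produced ⟨sh⟩ into ⟨shh⟩ once the Bengali rule runs. The Python sketch below is a hedged illustration, not the method actually used for these pages: the function name, the vowel list used to decide what counts as "before a consonant", and the restriction to lowercase input are all assumptions.

```python
import re
import unicodedata


def nfc(text):
    return unicodedata.normalize("NFC", text)


# Assumed vowel letters of the intermediate transcription (after steps 1 to 3).
VOWELS = nfc("aāiīuūeēoōô")

# Substitutions named in step 4; note that c -> ch also turns ch into chh.
SUBSTITUTIONS = {nfc(k): nfc(v) for k, v in [("c", "ch"), ("ś", "sh"), ("ṣ", "ṣh")]}

# Bengali only: an "s" that is NOT directly followed by a consonant letter,
# where a consonant is taken to be any letter outside VOWELS.
BENGALI_S = rf"s(?!(?![{VOWELS}])[^\W\d_])"


def substitute_graphemes(text, bengali=False):
    text = nfc(text)
    table = dict(SUBSTITUTIONS)
    pattern = "|".join(map(re.escape, table))
    if bengali:
        table["s"] = "sh"
        pattern += "|" + BENGALI_S
    # One pass over the text, so replacement output is never rescanned.
    return re.sub(pattern, lambda m: table[m.group()], text)


print(substitute_graphemes("candra"))               # chandra
print(substitute_graphemes("sakal", bengali=True))  # shakal
print(substitute_graphemes("stri", bengali=True))   # stri
```

Under this reading a word-final ⟨s⟩ in Bengali also becomes ⟨sh⟩, since it does not stand before a consonant.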

Maldivian is transliterated according to the 2000 draft, except that alifu and shaviyani with sukun are Romanised as ⟨ḫ⟩ at the end of a word; otherwise, the following consonant is doubled. In all other cases, sukun is ignored. The dotted letters are transcribed according to the actual Maldivian pronunciation and the other changes listed above are also applied.

Urdu is Romanised as if it had first been transliterated into Devanagari and then into the Latin alphabet.

English

Additional marks are used to indicate letters and groups of letters pronounced in an archaic or non-standard way: ⟨ā⟩, ⟨a͞i⟩ and ⟨e͟a⟩ for /eɪ/; ⟨ȧ⟩, ⟨ȧi⟩ and ⟨ė⟩ for /ɑː/; ⟨ạ⟩, ⟨ȯ⟩ and ⟨u̇⟩ for /ɔː/; ⟨ĕ⟩, ⟨e͝a⟩ and ⟨e͝i⟩ for /ɛ/; ⟨è⟩ for /ɪ/; ⟨ê⟩ and ⟨e᷍a⟩ for /ɛə/; ⟨e͞a⟩ for /iː/; ⟨e᷍u⟩ for /uː/; ⟨e͞w⟩ for /juː/; ⟨e͞y⟩, ⟨ī⟩, ⟨o͟i⟩, ⟨o͟y⟩ and ⟨ȳ⟩ for /aɪ/; ⟨ŏ⟩ and ⟨ọu⟩ for /ɒ/; ⟨ō⟩ and ⟨o͟o⟩ for /oʊ/; ⟨ọ⟩, ⟨ọo⟩ and ⟨ŭ⟩ for /ʌ/; ⟨o͝o⟩ and ⟨ụ⟩ for /ʊ/; ⟨o͞u⟩ for /aʊ/; ⟨s̱⟩ for /z/; and ⟨ṯ⟩ for /t/. Diphthongs followed by /r/ without an intervening /ə/ are overlined, e.g. fi͞re /faɪr/, ho͞u͞r /aʊr/, unless the omission of /ə/ is already indicated by an apostrophe, as in pow’r /paʊr/.

Ingush

The Romanisation is based on the BGN/PCGN system for Russian (1947), with the following additions and changes: ⟪а⟫ → ⟨ă⟩ when reduced or ⟨ª⟩ when elided; ⟪аь⟫ → ⟨ea⟩; ⟪в⟫ not followed by a vowel → ⟨w⟩; ⟪гӀ⟫ → ⟨gh⟩; ⟪кх⟫ → ⟨q⟩; ⟪ккх⟫ → ⟨qq⟩; ⟪къ⟫ → ⟨q’⟩; ⟪хь⟫ → ⟨ẖ⟩; ⟪хӀ⟫ → ⟨h⟩; ⟪ь⟫ → omitted; reduced ⟪я⟫ → ⟨yə⟩; ⟪яь⟫ → ⟨yea⟩; ⟪Ӏ⟫ → ⟨’⟩.
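Several of the Ingush rules concern digraphs and trigraphs, so a converter has to try the longest spelling first: ⟪ккх⟫ before ⟪кх⟫, and ⟪хь⟫ before a bare ⟪ь⟫. The Python sketch below shows one way to do that with a regular expression. It is an illustration under assumptions: the names are hypothetical, the base table is a small excerpt, only lowercase input is handled, and the context-dependent rules (reduced or elided ⟪а⟫, reduced ⟪я⟫, and ⟪в⟫ not followed by a vowel) are left out.

```python
import re

PALOCHKA = "\u04c0"    # Ӏ
APOSTROPHE = "\u2019"  # ’

# Spelling-based rules from the text above.
INGUSH_RULES = {
    "ккх": "qq",
    "кх": "q",
    "къ": "q" + APOSTROPHE,
    "г" + PALOCHKA: "gh",
    "х" + PALOCHKA: "h",
    "хь": "\u1e96",    # ẖ
    "яь": "yea",
    "аь": "ea",
    "ь": "",           # omitted when not part of a digraph
    PALOCHKA: APOSTROPHE,
}

# Hypothetical excerpt of a base table, just enough for the demo words.
BASE_EXCERPT = {"а": "a", "й": "y", "л": "l", "н": "n", "о": "o", "х": "kh"}


def romanise_ingush(text, base=BASE_EXCERPT):
    table = {**base, **INGUSH_RULES}
    # Longest keys first, so that multi-letter spellings win over single letters.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    text = text.replace("\u04cf", PALOCHKA)  # fold the lowercase palochka
    return pattern.sub(lambda m: table[m.group()], text)


print(romanise_ingush("г\u04c0алг\u04c0ай"))  # ghalghay
```

Characters missing from both tables pass through unchanged, which makes gaps in the base excerpt easy to spot.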

Komi and Mari

The Romanisation is based on the BGN/PCGN system for Russian (1947), with the following additions: ⟪ä⟫ → ⟨ä⟩; ⟪ҥ⟫ → ⟨ng⟩; ⟪ӧ⟫ → ⟨ö⟩; ⟪ӱ⟫ → ⟨ü⟩; ⟪ӹ⟫ → ⟨ÿ⟩.

Mordvinic languages

The Romanisation is based on the BGN/PCGN system for Russian (1947), except that after alveolar consonants, ⟪е⟫ is Romanised as ⟨ye⟩ and ⟪э⟫ as ⟨e⟩, without an interpunct.

Tibetan

The Romanisation is based on THL Simplified Phonetic Transcription of Standard Tibetan (2010), with the following changes:

  1. Aspirated consonants are always written with an ⟨h⟩, similar to the Wylie system. This includes consonants in low-tone syllables, which are written as voiced in the original THL system. The non-aspirated /tɕ/ sound is written as ⟨c⟩.
  2. ⟨s⟩ is used instead of ⟨z⟩, and ⟨sh⟩ instead of ⟨zh⟩.
  3. The open /ɛ/ sound is written as ⟨ä⟩.
  4. The acute accent over the ⟨e⟩ is omitted.
