Linguistic notes

Wherever possible, modern orthography has been used for names, titles and lyrics. For languages without a widely accepted Latin orthography, the following Romanisation systems have been employed:

Language(s)	Romanisation system
Abkhaz	BGN/PCGN (2011)
Adyghe	own Romanisation, see below
Amharic	BGN/PCGN (1967)
Arabic	EI3 (2007), with modifications¹
Armenian	BGN/PCGN (1981)
Bashkort	national (2007 or earlier)
Bulgarian	BGN/PCGN (1952)
Burmese	Okell (1969), with modifications²
Buryat	own Romanisation, see below
Cantonese	Yale (1958)
Church Slavonic	own Romanisation, see below
Dravidian and Indo-Aryan languages	own Romanisation, see below
Dzongkha	van Driem (2019)
Georgian	BGN/PCGN (1981)
Greek	ELOT 743 Type 2 (2001)
Hebrew	Academy of the Hebrew Language (2006)³
Ingush	own Romanisation, see below
Japanese	Hepburn (1908)
Khmer	BGN/PCGN (1972)
Kildin Sámi	Rießler (2022)
Komi	own Romanisation, see below
Korean	McCune–Reischauer (1939)
Lao	CNT (1966 or earlier)
Mandarin	Hànyǔ Pīnyīn (1958)
Mari	own Romanisation, see below
Mongolian	BGN/PCGN (1964)
Mordvinic languages	own Romanisation, see below
Ossetian	BGN/PCGN (2009)
Pashto	BGN/PCGN (2017), with modifications⁴
Persian (including Dari and Tajik)	UniPers (2003)⁵
Russian	BGN/PCGN (1947)⁶
Syriac	BGN/PCGN (2011)
Tatar	national (2012)
Thai	RTGS (1999), with modifications⁷
Tibetan	own Romanisation, see below
Tigrinya	BGN/PCGN (2007)
Udmurt	BGN/PCGN (2011)
Ukrainian	BGN/PCGN (1965)
Yiddish	YIVO
other Turkic languages	Common Turkic Alphabet (2024)

¹ Apostrophes are used instead of half-rings, and all sun letter assimilations are applied.
² The voicing of consonants is indicated according to their actual pronunciation.
³ If a transcription based on Ashkenazi pronunciation is required, tsere is Romanised as ⟨ey⟩; kamats gadol as ⟨o⟩; ẖolam as ⟨ô⟩; and ungeminated tav as ⟨s⟩.
⁴ The transcription has been simplified by omitting silent consonants and diacritics that do not affect the pronunciation.
⁵ The following extensions are used for Dari and Tajik: ⟨ğ⟩ for the sound /ɣ~ʁ/ (distinct from /q/); ⟨î û⟩ for the long vowels /iː uː/ in Dari; and ⟨ü⟩ for the phoneme /ʉ/ in Tajik.
⁶ If an unambiguous representation of the pre-revolutionary orthography is required, ⟪і⟫ is Romanised as ⟨ī⟩ and ⟪ѣ⟫ as ⟨ě⟩.
⁷ The following extensions have been taken from the 1939 version of the system: ⟨čh⟩ for unaspirated /tɕ/; and ⟨ǫ⟩ for the open vowel /ɔ/.

Adyghe

The Romanisation is based on the BGN/PCGN system (2012), with the following changes: ⟪е⟫ is Romanised as ⟨ye⟩; ⟪у⟫ as ⟨w⟩ if it is pronounced /w/ or if it labialises the preceding consonant, otherwise as ⟨u⟩; ⟪хъ⟫ as ⟨k͟h⟩; and ⟪э⟫ as ⟨e⟩. The apostrophe after ⟨q⟩ is omitted.

Buryat

The Romanisation is based on the BGN/PCGN system for Russian (1947), with the following additions and changes: ⟪е⟫ → ⟨ye⟩ (always); ⟪ё⟫ → ⟨yo⟩; ⟪ө⟫ → ⟨ö⟩; ⟪ү⟫ → ⟨ü⟩; ⟪һ⟫ → ⟨h⟩. The apostrophe after ⟨sh⟩ is omitted.
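
The Buryat additions can be pictured as a small override table layered on top of the Russian rules. The sketch below is only an illustration: the names `BURYAT_OVERRIDES` and `apply_buryat_overrides` are hypothetical, the BGN/PCGN Russian table itself is not reproduced, and the rule about the apostrophe after ⟨sh⟩ is not covered. Letters outside the override table are passed through unchanged. Code points are written as escapes so that look-alike Latin and Cyrillic letters cannot be confused.

```python
# Hypothetical override table for the Buryat additions listed above.
BURYAT_OVERRIDES = {
    "\u0435": "ye",      # е (always ye)
    "\u0451": "yo",      # ё
    "\u04e9": "\u00f6",  # ө -> ö
    "\u04af": "\u00fc",  # ү -> ü
    "\u04bb": "h",       # һ
}


def apply_buryat_overrides(text: str) -> str:
    """Replace only the letters in BURYAT_OVERRIDES; leave the rest as-is."""
    out = []
    for ch in text:
        lower = ch.lower()
        replacement = BURYAT_OVERRIDES.get(lower)
        if replacement is None:
            out.append(ch)  # left for the base Russian table (not shown)
        elif ch != lower:
            out.append(replacement.capitalize())  # keep an initial capital
        else:
            out.append(replacement)
    return "".join(out)
```

Running the overrides before the base table keeps the Buryat-specific letters from being mishandled by rules written for Russian.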

Church Slavonic

The Romanisation is based on the BGN/PCGN system for Bulgarian (1952), with the following additions: ⟪є⟫ → ⟨e⟩; ⟪ѕ⟫ → ⟨dz⟩; ⟪ї⟫ → ⟨i⟩; ⟪ѡ⟫ → ⟨o⟩; ⟪ꙋ⟫ → ⟨u⟩; ⟪ы⟫ → ⟨y⟩; ⟪ѣ⟫ → ⟨ě⟩; ⟪ѧ⟫ → ⟨ę⟩; ⟪ѳ⟫ → ⟨th⟩. The original diacritics and word-final hard yers are omitted.
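
The rule about word-final hard yers can be expressed as a short regular expression. This is a minimal sketch, assuming the diacritics have already been removed and the text is still in Cyrillic; the function name `drop_final_hard_yers` is made up for this illustration.

```python
import re

# ъ (U+044A) or Ъ (U+042A) immediately before a word boundary.
_FINAL_HARD_YER = re.compile("[\u044a\u042a]\\b")


def drop_final_hard_yers(text: str) -> str:
    """Remove hard yers at the end of words; keep word-internal ones."""
    return _FINAL_HARD_YER.sub("", text)
```

Because the pattern requires a word boundary after the yer, a hard yer followed by another letter is left in place.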

Dravidian and Indo-Aryan languages

The Romanisation is based on the ISO 15919:2001 standard, with option 9.1 (which allows macrons over ⟨ē⟩ and ⟨ō⟩ to be omitted in transcriptions from the Bengali and Devanagari scripts) applied, along with the following additional changes:

The inherent vowel is omitted if silent. In Bengali, it is otherwise written as ⟨ô⟩.
Ligatures are transcribed phonetically, not according to their orthography. Additionally, ⟨ri⟩ is written instead of ⟨r̥⟩ to account for phonetics.
Diacritics that do not affect pronunciation, and those on ⟨ā⟩ and ⟨ẏ⟩ in Bengali, are omitted.
Once the previous changes have been applied, the following grapheme substitutions are made: ⟨c⟩ → ⟨ch⟩; ⟨ś⟩ → ⟨sh⟩; ⟨ṣ⟩ → ⟨ṣh⟩. In Bengali, ⟨s⟩ is also changed to ⟨sh⟩, except before a consonant.
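
The final substitution step can be sketched as a single pass over the already adjusted ISO 15919 text. The code below is an illustration under several assumptions that are not part of the notes above: the function name `apply_substitutions` and its `bengali` flag are hypothetical, input is expected in lower case, every ⟨c⟩ is replaced (so an ISO ⟨ch⟩ would come out as ⟨chh⟩), and a "consonant" is taken to be any letter whose base character is not a, e, i, o or u.

```python
import unicodedata

VOWEL_BASES = set("aeiou")


def _is_consonant(ch: str) -> bool:
    """True for a letter whose base character is not a, e, i, o or u."""
    if not ch.isalpha():
        return False  # also covers the empty string at the end of the text
    base = unicodedata.normalize("NFD", ch)[0].lower()
    return base not in VOWEL_BASES


def apply_substitutions(text: str, bengali: bool = False) -> str:
    """Apply c -> ch, ś -> sh, ṣ -> ṣh, plus Bengali s -> sh, in one pass."""
    text = unicodedata.normalize("NFC", text)
    out = []
    for i, ch in enumerate(text):
        following = text[i + 1] if i + 1 < len(text) else ""
        if ch == "c":
            out.append("ch")
        elif ch == "\u015b":       # ś
            out.append("sh")
        elif ch == "\u1e63":       # ṣ
            out.append("\u1e63h")  # ṣh
        elif ch == "s" and bengali and not _is_consonant(following):
            out.append("sh")       # Bengali only, not before a consonant
        else:
            out.append(ch)
    return "".join(out)
```

Working in a single pass means that an ⟨sh⟩ produced by one rule is never scanned again by another.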

Maldivian is transliterated according to the 2000 draft, except that alifu and shaviyani with sukun are Romanised as ⟨ḫ⟩ at the end of a word; otherwise, the following consonant is doubled. In all other cases, sukun is ignored. The dotted letters are transcribed according to the actual Maldivian pronunciation and the other changes listed above are also applied.

Urdu is Romanised as if it had first been transliterated into Devanagari and then into the Latin alphabet.

English

Additional marks are used to indicate letters and groups of letters pronounced in an archaic or non-standard way: ⟨a̯⟩ and ⟨e̯⟩ for /ə/; ⟨ā⟩, ⟨a͞i⟩, ⟨e͟a⟩ and ⟨e͟i⟩ for /eɪ/; ⟨ȧ⟩, ⟨ȧi⟩ and ⟨ė⟩ for /ɑː/; ⟨ạ⟩, ⟨ȯ⟩ and ⟨u̇⟩ for /ɔː/; ⟨ĕ⟩, ⟨e͝a⟩ and ⟨e͝i⟩ for /ɛ/; ⟨è⟩ for /ɪ/; ⟨ê⟩ and ⟨e᷍a⟩ for /ɛə/; ⟨e͞a⟩ for /iː/; ⟨e᷍u⟩ for /uː/; ⟨e͞w⟩ for /juː/; ⟨e͞y⟩, ⟨ī⟩, ⟨i͞e⟩, ⟨o͟i⟩, ⟨o͟y⟩ and ⟨ȳ⟩ for /aɪ/; ⟨ŏ⟩ and ⟨ọu⟩ for /ɒ/; ⟨ō⟩, ⟨o͟o⟩ and ⟨o͟w⟩ for /oʊ/; ⟨ọ⟩, ⟨ọo⟩ and ⟨ŭ⟩ for /ʌ/; ⟨o͝o⟩ and ⟨ụ⟩ for /ʊ/; ⟨o͞u⟩ for /aʊ/; ⟨s̱⟩ for /z/; and ⟨ṯ⟩ for /t/. Diphthongs followed by /r/ without an intervening /ə/ are overlined, e.g. fi͞re /faɪr/, ho͞u͞r /aʊr/, unless the omission of /ə/ is already indicated by an apostrophe, as in pow’r /paʊr/.

Ingush

The Romanisation is based on the BGN/PCGN system for Russian (1947), with the following additions and changes: ⟪а⟫ → ⟨ă⟩ when reduced or ⟨ª⟩ when elided; ⟪аь⟫ → ⟨ea⟩; ⟪в⟫ not followed by a vowel → ⟨w⟩; ⟪гӀ⟫ → ⟨gh⟩; ⟪кх⟫ → ⟨q⟩; ⟪ккх⟫ → ⟨qq⟩; ⟪къ⟫ → ⟨q’⟩; ⟪хь⟫ → ⟨ẖ⟩; ⟪хӀ⟫ → ⟨h⟩; ⟪ь⟫ → omitted; reduced ⟪я⟫ → ⟨yə⟩; ⟪яь⟫ → ⟨yea⟩; ⟪Ӏ⟫ → ⟨’⟩.
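
Several of the Ingush rules involve digraphs and a trigraph that share letters, so the longest sequence has to be tried first. The sketch below shows one way to do that. It is an illustration only: `INGUSH_RULES` and `apply_ingush_digraphs` are hypothetical names, input is assumed to be lower case, the rules that depend on pronunciation or on neighbouring letters (reduced or elided ⟪а⟫, reduced ⟪я⟫, ⟪в⟫ not followed by a vowel) are left out, and the base Russian table is not reproduced.

```python
import re

# Context-free Ingush rules from the list above, keyed by Cyrillic sequence.
INGUSH_RULES = {
    "\u043a\u043a\u0445": "qq",  # ккх
    "\u043a\u0445": "q",         # кх
    "\u043a\u044a": "q\u2019",   # къ -> q’
    "\u0433\u04cf": "gh",        # гӏ
    "\u0445\u044c": "\u1e96",    # хь -> ẖ
    "\u0445\u04cf": "h",         # хӏ
    "\u044f\u044c": "yea",       # яь
    "\u0430\u044c": "ea",        # аь
    "\u04cf": "\u2019",          # ӏ (palochka) -> ’
    "\u044c": "",                # ь omitted
}

# Longest keys first, so that ккх wins over кх and аь wins over a bare ь.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(INGUSH_RULES, key=len, reverse=True))
)


def apply_ingush_digraphs(text: str) -> str:
    """Replace the sequences in INGUSH_RULES; leave other letters untouched."""
    text = text.replace("\u04c0", "\u04cf")  # treat capital palochka as small
    return _PATTERN.sub(lambda m: INGUSH_RULES[m.group(0)], text)
```

Letters not covered by the table stay in Cyrillic and would still need the base Russian rules.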

Komi and Mari

The Romanisation is based on the BGN/PCGN system for Russian (1947), with the following additions: ⟪ä⟫ → ⟨ä⟩; ⟪ҥ⟫ → ⟨ng⟩; ⟪ӧ⟫ → ⟨ö⟩; ⟪ӱ⟫ → ⟨ü⟩; ⟪ӹ⟫ → ⟨ÿ⟩.

Mordvinic languages

The Romanisation is based on the BGN/PCGN system for Russian (1947), except that after alveolar consonants, ⟪е⟫ is Romanised as ⟨ye⟩ and ⟪э⟫ as ⟨e⟩, without an interpunct.

Tibetan

The Romanisation is based on THL Simplified Phonetic Transcription of Standard Tibetan (2010), with the following changes:

Aspirated consonants are always written with an ⟨h⟩, similar to the Wylie system. This includes consonants in low-tone syllables, which are written as voiced in the original THL system. The non-aspirated /tɕ/ sound is written as ⟨c⟩.
⟨s⟩ is used instead of ⟨z⟩, and ⟨sh⟩ instead of ⟨zh⟩.
The open /ɛ/ sound is written as ⟨ä⟩.
The acute accent over the ⟨e⟩ is omitted.
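
Two of the changes above are purely orthographic and can be applied to an existing THL spelling without knowing the pronunciation. The following sketch covers only those two (⟨z⟩ to ⟨s⟩, which also turns ⟨zh⟩ into ⟨sh⟩, and the removal of the acute accent). The aspiration rule, ⟨c⟩ and ⟨ä⟩ depend on the underlying Tibetan and are not handled. The name `adjust_thl_spelling` is hypothetical.

```python
import unicodedata

# z -> s (this also turns zh into sh) and é -> e, in both cases.
_THL_ADJUSTMENTS = str.maketrans({
    "z": "s",
    "Z": "S",
    "\u00e9": "e",  # é
    "\u00c9": "E",  # É
})


def adjust_thl_spelling(text: str) -> str:
    """Apply the purely orthographic changes to a THL-style spelling."""
    text = unicodedata.normalize("NFC", text)  # compose e + acute into é
    return text.translate(_THL_ADJUSTMENTS)
```

For a sample THL-style input such as "Zhikatsé", this returns "Shikatse".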
