Wiktionary:Persian transliteration

These are the rules concerning transliteration in Persian entries.

Three transliteration schemes are used in Persian entries (see below). Of these, the Tajik transliteration should only appear in a pronunciation template; elsewhere in entries, only the Classical and Iranian transliterations should be used. Editors discussed and supported the idea of streamlining transliterations by inputting the classical transliteration in all templates and automatically generating the Iranian transliteration (thus displaying both with one input), but this has not been implemented yet because the modules needed to generate two transliterations are still works in progress. Until those modules are available, transliterations are used as follows:

In pronunciation templates: pronunciation templates must always use the classical transliteration exclusively, as it is not possible to generate the Classical, Dari and Tajik romanizations from the Iranian romanization. If the term is exclusive to Iranian Persian, still input the classical romanization; the pronunciation module fa-IPA can generate all romanizations from it. Simply set the module to generate an Iranian transliteration.

In link and headword templates: after streamlining, link and headword templates will also be able to generate both the Iranian and Classical romanizations, as fa-IPA does. In the meantime, there is no consensus on which transliteration to use. The most common practice is to use a modified Iranian transliteration, with q and ğ distinguished.

Within quotation and conjugation templates: all modern varieties spoken in Iran after the 14th-16th centuries should use the Iranian transliteration. All other varieties (including medieval varieties spoken in Iran) should use the Classical transliteration.

After streamlining is complete, it will be possible to use the classical transliteration in all cases and have the Iranian transliteration automatically generated.
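
As a rough illustration of why the classical romanization can serve as the single input, the sketch below derives an Iranian-style romanization from a classical one, using only letter correspondences that can be read off the tables and examples on this page. The function name, the mapping table and the handling of the na exception are assumptions made for illustration. This is not the fa-IPA module, it only handles lowercase input, and it ignores lexical differences between varieties (for example, words whose short vowels differ in Iran).

```python
import re
import unicodedata

# Hypothetical one-to-one letter correspondences (classical -> Iranian).
LETTER_MAP = {"ā": "â", "ī": "i", "ē": "i", "ū": "u", "ō": "u",
              "i": "e", "u": "o", "w": "v", "q": "ğ"}

# Words listed on this page as exceptions to the final a -> e rule.
EXCEPTIONS = {"na"}

PATTERN = re.compile(
    r"(?P<diph>a[wy])(?![aāiīēuūōwy])"  # aw/ay not followed by a vowel or glide
    r"|(?P<final_a>a)(?=-|$)"           # a at the end of a word or hyphenated segment
    r"|(?P<other>.)"                    # any other single character
)

def _convert_word(word: str) -> str:
    if word in EXCEPTIONS:
        return word

    def repl(match: re.Match) -> str:
        if match.group("diph"):
            return "ow" if match.group("diph") == "aw" else "ey"
        if match.group("final_a"):
            # Assumes the final a stands for a word-final ه vowel.
            return "e"
        ch = match.group("other")
        return LETTER_MAP.get(ch, ch)

    return PATTERN.sub(repl, word)

def classical_to_iranian(text: str) -> str:
    """Convert a classical romanization to an approximate Iranian one."""
    text = unicodedata.normalize("NFC", text)
    return " ".join(_convert_word(word) for word in text.split(" "))
```

Under these assumptions, classical_to_iranian("xāna-yi kalān") gives xâne-ye kalân, and a geminated glide as in awwal is kept as a consonant (avval) rather than read as a diphthong. The reverse direction is not possible with such a table, because several classical letters merge into one Iranian letter.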

Classical Transliteration

This transliteration should be used for any variety that is not modern Iranian Persian (i.e. Classical, Dari, Hazaragi and Tajik) for terms in the Arabic script. For Tajik terms in Cyrillic, see the Tajik transliteration.

This is the transliteration used in all pronunciation templates.

Consonants

No. Letter Name of letter Transcription IPA
1 ا alif ā, ∅ (see below) /ʔ/, /aː/, [ɑː], /ʔ∅/
1b آ ā, 'ā /ʔaː/, [ʔɑː]
2 ب b /b/
3 پ p /p/
4 ت t /t/
5 ث s /s/
6 ج jīm j /d͡ʒ/
7 چ čē č /t͡ʃ/
8 ح h /h/
9 خ x /x/
10 د dāl d /d/
11 ذ zāl z /z/
12 ر r /r/
13 ز z /z/
14 ژ žē ž /ʒ/
15 س sīn s /s/
16 ش šīn š /ʃ/
17 ص sād or swād s /s/
18 ض zād or zwād z /z/
19 ط or tōy t /t/
20 ظ or zōy z /z/
21 ع 'ayn ' /ʔ/
22 غ ğayn ğ /ɣ/
23 ف f /f/
24 ق qāf q /q/
25 ک kāf k /k/
26 گ gāf g /ɡ/
27 ل lām l /l/
28 م mīm m /m/
29 ن nūn n /n/, [n], [ŋ], [ɴ]
30 و wāw (ma'rūf) w, ū /w/, /uː/
wāw (majhūl) ō /oː/
31 ه h, ∅ (see below) /h/
32 ی (ma'rūf) y, ī /j/, /iː/
(majhūl) ē /eː/
0 ء hamza ', – /ʔ/
  • Dental, labial and velar stops are always aspirated, with few exceptions.
  • /t/ and /d/ are phonetically dental in nearly all varieties; in Hazaragi and some other varieties of Dari they are also phonemically dental.
  • Alif is, with few exceptions, only a glottal stop in word-initial positions.
  • Geminated consonants are marked with the consonant diacritic tašdīd (ـّ) and are written with doubled letters, both in IPA and in transliterations.
  • hē (ه) in word-final positions may act as a placeholder for any short vowel instead of a consonant, most commonly the short vowel /a/.
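
The consonant and gemination rules above can be sketched as a small lookup-based converter. This is a minimal illustration only: the function name and table are hypothetical, it covers just the consonants with a single transcription plus the short-vowel diacritics, tašdīd and jazm, and it deliberately rejects alif, wāw, hē and yē, whose reading depends on context.

```python
import unicodedata

# Consonants with one unambiguous classical transcription (from the table above).
CONSONANTS = {
    "ب": "b", "پ": "p", "ت": "t", "ث": "s", "ج": "j", "چ": "č", "ح": "h",
    "خ": "x", "د": "d", "ذ": "z", "ر": "r", "ز": "z", "ژ": "ž", "س": "s",
    "ش": "š", "ص": "s", "ض": "z", "ط": "t", "ظ": "z", "ع": "'", "غ": "ğ",
    "ف": "f", "ق": "q", "ک": "k", "گ": "g", "ل": "l", "م": "m", "ن": "n",
}
SHORT_VOWELS = {"\u064E": "a", "\u0650": "i", "\u064F": "u"}  # zabar, zēr, pēš
TASHDID = "\u0651"
JAZM = "\u0652"

def transliterate_skeleton(word: str) -> str:
    """Transliterate a fully vocalized word made of simple consonants only."""
    out = []
    last_consonant = None  # index in `out` of the most recent consonant
    for ch in unicodedata.normalize("NFC", word):
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            last_consonant = len(out) - 1
        elif ch in SHORT_VOWELS:
            out.append(SHORT_VOWELS[ch])
        elif ch == TASHDID:
            if last_consonant is None:
                raise ValueError("tašdīd without a preceding consonant")
            # Gemination: write the consonant twice.
            out[last_consonant] = out[last_consonant] * 2
        elif ch == JAZM:
            continue  # zero-vowel mark, nothing to write
        else:
            raise ValueError(f"unsupported character {ch!r}")
    return "".join(out)
```

Tracking the index of the last consonant means the doubling works whether the tašdīd is stored before or after the short vowel in the input.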

About retroflex forms

Some regional dialects in Afghanistan, such as Hazaragi, have the retroflex consonants /ʈ/ and /ɖ/ as distinct phonemes. However, Hazaragi is treated by its speakers as a spoken form of Dari so it does not have a standardized written form. As these phonemes are not present in standard Dari, their standardized forms are always written with tē (ت) and dāl (د) respectively.

Occasionally the nonstandard characters ٹ, ڈ or ټ, ډ are used by Hazaragi speakers, but there is no consensus on how these forms should be treated.

Vowels

The vocalization used by Classical Persian and Dari differs slightly from the vocalization used by Iranian Persian. The table below shows the vocalization used by Classical Persian and Dari.

Romanization IPA Final Medial Initial
a /a/ ـَ اَ
ā /aː/ [ɑː] ـا,ـیٰ ـا آ
i /i/ ـِ اِ
ī /iː/ ـِى ـِیـ اِیـ
ē /eː/ ـی ـیـ ایـ
u /u/ ـُ اُ
ū /uː/ ـُو اُو
ō /oː/ ـو او
  • Word-final short vowels are usually written with the vowel diacritic preceding a ه.
  • The diacritic zēr is often realized as [ɛ] in many nonstandard dialects outside of Kabul, such as the Herati dialect. However, it is /ɪ/ in standard pronunciation and in the Kabuli dialect.

Diphthongs

Romanization IPA Final Medial Initial
ay /aj/ ـَىْ ـَیْـ اَیْـ
āy /aːj/ ـَاىْ ـَایْـ آیْـ
aw /aw/ ـَوْ اَوْ
āw /aːw/ ـَاوْ آوْ
ūy /uːj/ ـُوىْ ـُویْـ اُویْـ
ōy /oːj/ ـوىْ ـویْـ اویْـ
  • All diphthongs are interpreted as a phonemic sequence of a vowel + semi-vowel. Consequently, if there are two adjacent vowels, at least one should become a semi-vowel (so that there are never two adjacent vowels).
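
The no-adjacent-vowels rule lends itself to a simple check on a finished classical romanization. The sketch below is an assumption-laden illustration, not an existing Wiktionary tool: the function name is hypothetical, and a hyphen between two vowels is treated as still leaving them adjacent.

```python
import re
import unicodedata

VOWELS = "aāiīēuūō"  # classical short and long vowels from the tables above

# Two vowels in a row, optionally separated by a hyphen.
_ADJACENT = re.compile(f"[{VOWELS}]-?[{VOWELS}]")

def find_adjacent_vowels(romanization: str) -> list[str]:
    """Return every vowel pair that violates the vowel + semi-vowel rule."""
    text = unicodedata.normalize("NFC", romanization.lower())
    return [match.group(0) for match in _ADJACENT.finditer(text)]
```

An empty list means the romanization is consistent with the rule; a non-empty list points at the spots where a y or w is probably missing.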

Vowel diacritics

Vowel Name Transcription IPA Notes
  zabar a /a/, [ä]~[æ]
  zēr i /i/ May also be called zēr-i ma'rūf
  zēr-i majhūl [e̞] Only appears before glottal consonants, technically an allophone.
  pēš u /u/ May also be called pēš-i ma'rūf
  pēš-i majhūl [o̞] Only appears before glottal consonants, technically an allophone.
  jazm N/A none Vowel killer / zero-vowel diacritic.
  • Before a word-final ه, ma'rūf and majhūl diacritics are not clearly distinguished.
  • Though short vowels also have ma'rūf-majhūl variants, majhūl short vowels likely will not be included in romanizations. Unlike the majhūl long vowels, which can appear anywhere, the majhūl short vowels only appear before glottal consonants, and are technically allophones.

Additional information

  1. ـًا, ـاً, ءً (always word-final) – an
    For modern varieties ـاً is preferred. For classical Persian, both ـًا and ـاً are acceptable.
  2. All forms of hamza, including ء, ؤ and ئ are transliterated as '
  3. The izāfa vowel is transcribed differently depending on context:
    • ـِ (always word-final, after consonants) – -i
    • یِ (after the long vowels ا (ā) or و (ū, ō)) – -yi
      آفریقایِ جنوبی (āfrīqā-yi janūbī, "South Africa")
    • یِّ (always word-final after ی) – ī-yi
      جَمْهُوریِّ کورِیا (jamhūrī-yi kōriyā, "Republic of Korea")
    • ـهٔ, with the form ـه‌ی treated as a variant – a-yi
      خانهٔ کلان (xāna-yi kalān, "a big house"), spelled with a hamza diacritic
      خانه‌ی کلان (xāna-yi kalān, "a big house"), spelled with a non-connecting ye
      In spoken Dari هٔ may reduce to a short e (or i), so خانهٔ کلان may be pronounced xāne kalān. Whether this reduction will be reflected in transliteration is undecided, since this is the only izāfa form in which it occurs.
  4. ـِیِّـ is transliterated as iyyi, the only exception being for Izāfa when it is transliterated as ī-yi.
  5. ـّ (tašdīd) – geminate consonant (Arabic shadda)
  6. Al- assimilation الـ
    • Only occurs in loaned compound terms from Arabic, as the article الـ is typically dropped from the lemma form of all Arabic loanwords.
    • if الـ is followed by one of the 'sun letters' of Arabic, the lām ل will assimilate with the following letter.
    • if الـ is part of a conjugation where the alif is silent, the alif-lām is transliterated as l- (or -l- if there is a ZWNJ)
      حَبْلُ المَتِین (hablu l-matīn)
      حَبْلُ‌المَتِین (hablu-l-matīn) (word with ZWNJ)
      فِالْحَال (fi-l-hāl) (example within a single word)
  7. ـه - when used as a colloquial copula in the 3rd person singular (he/she/it is) - -a (with a hyphen)
    تَهْران پایتَخْتِ ایرانه. (ta(h)rān pāytaxt-i ērān-a., "Tehran is the capital of Iran.") (colloquial)
  8. ZWNJ – - (hyphen)
  9. Various governments of Afghanistan have recommended that the suffix ـگی have a space or ZWNJ when added to a word ending in ـه. This suggestion is not always observed, even in academic settings and by media broadcasters in Afghanistan. These spellings may be included as alternative forms.
    زنده‌گی (zinda-gī), recommended spelling
    زندگی (zindagī), common spelling

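The classical izāfa rules in item 3 above reduce to a choice between -i and -yi based on the last letter of the romanized head word. The following is a minimal sketch under that reading; the function name is hypothetical, and endings the rules do not mention (final i, u or ē) are rejected rather than guessed.

```python
import unicodedata

# Final vowels for which the rules above prescribe -yi:
# a (word-final ه), ā, ū, ō and ī.
YI_FINALS = "aāūōī"
# Vowel endings that the rules above do not cover.
UNCOVERED_FINALS = "iuē"

def add_izafa_classical(head: str) -> str:
    """Append the classical izāfa to a romanized head word."""
    head = unicodedata.normalize("NFC", head)
    if not head:
        raise ValueError("empty head word")
    last = head[-1]
    if last in YI_FINALS:
        return head + "-yi"
    if last in UNCOVERED_FINALS:
        raise ValueError(f"ending {last!r} is not covered by the rules")
    return head + "-i"  # after consonants
```

For instance, add_izafa_classical("xāna") yields xāna-yi and add_izafa_classical("pāytaxt") yields pāytaxt-i, matching the examples in this section.
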
Iranian Transliteration

This transliteration should be used for modern Iranian Persian, particularly varieties spoken in Iran after the 14th-16th centuries. Varieties of Iranian Persian spoken before the 14th-16th centuries should, with some exceptions, be treated as Classical Persian.

Systems for Romanizing Persian
Persian Wiktionary IPA Others (dispreferred)
ا (word-initial) a, o, e (ʔ)æ, (ʔ)o, (ʔ)e
ا (other positions) â ɒː ā
آ â (word-initial), 'â (other positions) (ʔ)ɒː (word-initial), ʔɒː (other positions) ʼā
ب b b
پ p p
ت t t
ث s s th, s̱, ṯ, s̄
ج j ǧ
چ č ch, c
ح h h ḥ, ḩ
خ x x kh, k͟h, ḫ, ḵ
د d d
ذ z z dh, d͟h, ẕ, ḏ
ر r r
ز z z
ژ ž ʒ zh, z͟h
س s s
ش š ʃ sh, s͟h
ص s s
ض z z ḍ, ż, ẕ
ط t t ṭ, ţ
ظ z z ẓ, z̧
ع ' ʔ, ː ʻ
غ ğ ɣ, ɢ q, gh, g͟h, ġ
ف f f
ق ğ ɣ, ɢ q, gh, g͟h, ḳ
ک k k
گ g g
ل l l
م m m
ن n n
و (consonant) v v w
و (long vowel) u, ô uː, oː ū, ō
و (diphthong[1]) ow ow au, aw
خوا (e.g. خواندن etc.) xɒː xwā-, khwā-
خوی (e.g. خوید etc.) xi xiː
ه (consonant) h (may appear in final position after a vowel, e.g. ده (dah)) h
ـه (word-final vowel) e e, æ eh, a, ah
ی (consonant) y j j
ی (long vowel) i, ê iː, eː ī, ē
ی (diphthong[2]) ey ej ai, ay
یٰ (always word-final) â ɒː ā, á
  1. ^ Unless bearing a shadda, in which case it is treated as consonant, e.g. اوّل (avval), not *owval
  2. ^ Unless bearing a shadda, in which case it is treated as consonant, e.g. ایّوب (ayyub), not *eyyub

Other symbols or combinations

  1. ـاً (-an), ءً ('an) (always word-final) – an (The position of [fatHatan] is after the alef, not before, as is the current practice with Arabic)
  2. ء – ' (others: ʼ)
  3. ؤ – ' (others: ʼ)
  4. ئ – ' (others: ʼ)
  5. ـِ (-e) (ezâfe) (always word-final, unmarked in regular writing) – -e
  6. یِ (-ye) (ezâfe) (after long vowels ا (â) or و (u), unmarked in regular writing) - -ye
    آفْریقایِ جُنوبی (âfriğâ-ye jonubi, "South Africa")
  7. یِ (-ye) (ezâfe) (always word-final with ی (i), unmarked in regular writing) - i-ye
    جُمْهوریِ تاجیکِسْتان (jomhuri-ye tâjikestân, "Republic of Tajikistan")
  8. ـهٔ (-h-ye) (U+0647 U+0654), sometimes written as ـه‌ی (-h-i) (always word-final) – e-ye. (Articles don't contain the hamze above "he"; it is considered a diacritic and is only used in the display, via |head=. Templates link to words without the hamze.)
    خانِهٔ بُزُرْگ (xâne-ye bozorg, "a big house"), spelled with a hamze diacritic
    خانِه‌ی بُزُرْگ (xâne-ye bozorg, "a big house"), spelled with a non-connecting ye
  9. ـه‌ای (-h-i) - e-yi
  10. نه (no, not) - na (an exception)
  11. ـّ (tashdid) – geminate consonant (Arabic shadda)
  12. ـَ (-a) (fathe/zor) – a (Arabic fatha)
  13. ـِ (-e) (kasre/zir) – e (in modern Iranian, to check cases where it's "i") (Arabic kasra)
  14. ـُ (-o) (zamme/pish) – o (in modern Iranian, to check cases where it's "u") (Arabic damma). Used after consonants to make a short "o" sound. If used before و (o) produces a diphthong "ow":
    نُوْروز (nowruz)
  15. ـّ (shadda) – geminate consonant
  16. ـ۟ (sukūn/sokun) - marks absence of a vowel. Rarely used in popular Persian vocalisations, especially on final consonants. It may be necessary to use consistently in strict vocalisations to avoid any misreadings, allow automation and signalling that a word IS vocalised.
  17. ـه (-h) (in the word-final position after consonants for a large number of words) - e (no hyphen) (note that with ezâfe the preferred spelling is ـهٔ (-h-ye))
    هَفْتِه (hafte, "week")
  18. ـه (-h) - when used as a colloquial copula in the 3rd person singular (he/she/it is) - -e (with a hyphen)
    تِهْرون پایْتَخْتِ ایرونِه. (tehrun pâytaxt-e irun-e., "Tehran is the capital of Iran.") (colloquial)
  19. ZWNJ – - (hyphen)
  20. The use of hyphens for etymological reasons (suffixes, compound words, etc.) when no ZWNJ is used is to be discussed. E.g. currently the plural suffix ها is transliterated as "-hâ" regardless of whether a ZWNJ is present. (Apart from the cases described above and ZWNJ, the use of hyphens is otherwise dispreferred. A space is transliterated as a space, and the absence of a space or ZWNJ is transliterated as nothing.)
    Below are transliteration examples to contrast the use of ZWNJ on connecting letters, space, nothing and non-connecting letters:
    ZWNJ on joining letters کتاب‌ها (ketâb-hâ) (plural of کتاب (ketâb))
    Space کتاب ها (ketâb hâ)
    Nothing (joining letters are connected) کتابها (ketâbhâ)
    Non-joining letters (no ZWNJ is used) اتوها (otuhâ) (plural of اتو (otu))
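
The contrast between ZWNJ, space and nothing can be sketched as a segment-joining step. In the sketch below the per-segment transliterations come from a caller-supplied lookup table, which is purely a stand-in for whatever actually produces them; the function name and that table are assumptions for illustration.

```python
ZWNJ = "\u200c"  # zero-width non-joiner

def join_segments(text: str, lookup: dict[str, str]) -> str:
    """Join per-segment transliterations: ZWNJ -> hyphen, space -> space.

    `lookup` maps each Arabic-script segment to its transliteration.
    A word written with no ZWNJ and no space is looked up as one unit,
    so nothing is inserted inside it.
    """
    words = []
    for word in text.split(" "):
        parts = [lookup[segment] for segment in word.split(ZWNJ)]
        words.append("-".join(parts))
    return " ".join(words)
```

With a lookup containing کتاب (ketâb), ها (hâ) and کتابها (ketâbhâ), the three spellings listed above come out as ketâb-hâ, ketâb hâ and ketâbhâ respectively.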

Arabic loanwords

  1. ـة (-h) (always word-final) – a(t) (rare, only in unadapted borrowings from Arabic, normally adapted into Persian as ـت (at) or ـه (e))
  2. الـ - al (normally), or "al-" (with a hyphen) if identified as the Arabic definite article. "l" can change to the following consonant if it is a "sun letter", and "a" can change to "o" (Arabic "u") in ezâfe, e.g.
    فارِغ‌ُالتَّحْصیل (fâreğo-t-tahsil) - here "l" is assimilated to "t" and "a" is changed to "o" following Arabic grammar rules.
  3. الـ - l or the next consonant (assimilated for Arabic "sun letters"). The alef is silent.
    بِٱلْفِعْل (belfe'l) from Arabic بِٱلْفِعْلِ (bi-l-fiʕli) where the alif is silent (أَلِف الوَصْل (ʔalif al-waṣl))

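The sun-letter assimilation described above can be sketched as a lookup on the letter that follows the article. The set of Arabic sun letters is standard; the function name is hypothetical, and the Latin values are simply the transcriptions that this page assigns to those letters.

```python
# Arabic sun letters with their transcription on this page.
SUN_LETTERS = {
    "ت": "t", "ث": "s", "د": "d", "ذ": "z", "ر": "r", "ز": "z", "س": "s",
    "ش": "š", "ص": "s", "ض": "z", "ط": "t", "ظ": "z", "ل": "l", "ن": "n",
}

def article_lam(next_letter: str) -> str:
    """Return how the ل of الـ is written before `next_letter`.

    Before a sun letter the lām assimilates to that letter;
    before any other letter it stays "l".
    """
    return SUN_LETTERS.get(next_letter, "l")
```

Before ت the result is t, as in fâreğo-t-tahsil, while before ف or م it remains l, as in belfe'l and hablu l-matīn.
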

Tajik Transliteration

This transliteration should be used for Tajik terms attested in the Cyrillic script and in the pronunciation section of Persian entries. Any Tajik terms attested in the Arabic script should be treated as Dari and use the PRS language code, which will generate the Classical Transliteration.

No. Letter Transcription IPA
1 А а a /a/
2 Б б b /b/
3 В в v /v/
4 Г г g /ɡ/
5 Ғ ғ ġ /ʁ/
6 Д д d /d/
7 Е е e, ye /e/, /je/
8 Ё ё yo /jɔː/, /jɒː/
9 Ж ж ž /ʒ/
10 З з z /z/
11 И и i, yi /i/, /iː/, /ji/
12 Ӣ ӣ ī, yī /ˈiː/, /ˈjiː/
13 Й й y /j/
14 К к k /k/
15 Қ қ q /q/
16 Л л l /l/
17 М м m /m/
18 Н н n /n/
19 О о o /ɔː/, /ɒː/
20 П п p /p/
21 Р р r /r/
22 С с s /s/
23 Т т t /t/
24 У у u /u/, /uː/
25 Ӯ ӯ ü /ɵ/
26 Ф ф f /f/
27 Х х x /χ/
28 Ҳ ҳ h /h/
29 Ч ч č /tʃ/
30 Ҷ ҷ j /dʒ/
31 Ш ш š /ʃ/
32 Ъ ъ ʾ /ʔ/
33 Э э e /e/
34 Ю ю yu /ju/, /juː/
35 Я я ya /ja/
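
Because the Tajik Cyrillic table is almost entirely one letter to one transcription, it can be sketched as a plain character map. The function name is hypothetical. For the letters with two listed transcriptions (е, и, ӣ) this sketch always takes the first one, which is an assumption, since the table does not say when ye, yi and yī apply.

```python
import unicodedata

# Lowercase Tajik Cyrillic -> Latin, following the table above.
# For е, и and ӣ only the first listed transcription is used (assumption).
_RAW_MAP = {
    "а": "a", "б": "b", "в": "v", "г": "g", "ғ": "ġ", "д": "d", "е": "e",
    "ё": "yo", "ж": "ž", "з": "z", "и": "i", "ӣ": "ī", "й": "y", "к": "k",
    "қ": "q", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ӯ": "ü", "ф": "f", "х": "x", "ҳ": "h",
    "ч": "č", "ҷ": "j", "ш": "š", "ъ": "ʾ", "э": "e", "ю": "yu", "я": "ya",
}
# Normalize keys so precomposed and decomposed input are treated alike.
TAJIK_MAP = {unicodedata.normalize("NFC", k): v for k, v in _RAW_MAP.items()}

def transliterate_tajik(text: str) -> str:
    """Transliterate Tajik Cyrillic text letter by letter (lowercased)."""
    text = unicodedata.normalize("NFC", text.lower())
    # Characters outside the table (spaces, punctuation) pass through unchanged.
    return "".join(TAJIK_MAP.get(ch, ch) for ch in text)
```

A fuller version would need the positional rule for ye, yi and yī, which would have to come from the transliteration module's own documentation rather than from this table.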

See also
