TRANSCRIPTION PROTOCOLS
O.T. FORD UPDATED 2007 AUGUST 7
The language usage here advocated and employed is based on the belief that, particularly in English with its strengthening global dominance, inclusion of other linguistic traditions is important for cultural heritage and for the facilitation of intercultural understanding. The first principle is that all names, of persons, places, organizations, and concepts, should be left in the original form wherever possible. The main Hellenistic alphabets ― Latin, Cyrillic, and Greek itself ― should be presented in the original alone. The assumption, in computerized communication, is that most computers are equipped with the requisite characters for these Hellenistic scripts. Virtually all names in the Latin scripts can be printed unaltered. Most names in the Cyrillic scripts can also be shown unaltered. Occasional characters or diacritical marks are unavailable, and these should be approximated rather than transliterated, with the remaining characters in a name left in the original form. Modern Greek can be shown in its native form; Ancient Greek should be presented in its classical form. But for other scripts no such assumption can be made, and names and words of non-English origin should be printed both in the original (where possible) and in transliterated or transcribed form. The transliteration or transcription employed here attempts to be regular, though mistakes have certainly been made, and corrections are welcome.
The second principle, where transliteration is required, is that it should be possible to reconstruct the original; the transliterated form should unambiguously represent the original writing. Therefore, while occasionally an original grapheme will be transliterated by more than one Latin symbol (primarily when a single grapheme represents different phonemes), only rarely (when unavoidable) will one Latin symbol represent more than one original grapheme.
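The reversibility requirement can be checked mechanically. The following is a minimal sketch in Python, not part of this site's own tooling: the function names are hypothetical, and the two sample tables copy only a few entries exactly as they are printed in the Cyrillic and Arabic lists in this document.

```python
from collections import defaultdict

# A few entries copied as printed in the lists in this document (samples only).
CYRILLIC_SAMPLE = {"А": "A", "Б": "B", "В": "V", "Г": "G", "Д": "D"}
ARABIC_SAMPLE = {"س": "S", "ش": "Š", "ص": "S"}


def find_collisions(table):
    """Return each Latin symbol that stands for more than one original grapheme."""
    sources = defaultdict(list)
    for grapheme, latin in table.items():
        sources[latin].append(grapheme)
    return {latin: graphemes for latin, graphemes in sources.items()
            if len(graphemes) > 1}


def is_reversible(table):
    """True when no Latin symbol is shared, so each symbol maps back uniquely."""
    return not find_collisions(table)


if __name__ == "__main__":
    print(is_reversible(CYRILLIC_SAMPLE))   # True
    print(find_collisions(ARABIC_SAMPLE))   # {'S': ['س', 'ص']}
```

A collision reported by such a check marks a place where the original cannot be reconstructed from the Latin form alone. The check looks only at single table entries; it does not test whether multi-letter values such as ( JA ) can be segmented unambiguously in running text.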
There are several minor principles concerning transliteration. Unlike in the International Phonetic Alphabet and numerous transliteration schemes, capitals and lowercase should not be used distinctively, so that they may be used, following normal practice, as contextual variants of each other. That is, most letters used for transliteration can be written in capital or lowercase form, as appropriate. Where scripts are related, transliterations should also be related, as with the Semitic or Indian scripts below. And finally, borrowings should not be transliterated as written, but rather restored to or transliterated as the original.
Since 漢字 hàn zì is ideographic, it should be transcribed by pronunciation rather than transliterated. For 漢語 Hàn yǔ, the best option is pīn yīn, with each syllable transcribed separately. If, for another dialect written in 漢字 hàn zì, there is a pīn yīn standard that reflects the native pronunciation, that standard should be favored. (This site uses Yale romanization for 廣話 Gwóng Wáa and 白話字 Pe̍h Ōe Jī for 閩南語 Bân Lâm Gú.)
Phoenician: א ’i ב B ג G ד D ה H ו Ŭ ז Z ח H ט T י J כ K ל L מ M נ N ס S ע C פ P צ S ק Q ר R ש Š ת T. No separate encoding for Phoenician exists; convention replaces it with modern Hebrew forms.
― Arabic: ا ’-Ā ب B ت T ث Þ ج Ĝ ح H خ X د D ذ Ð ر R ز Z س S ش Š ص S ض D ط T ظ Ð ع C غ Ğ ف F ق Q ك K ل L م M ن N ه H و Ŭ-Ū ى J-Ī (ة Ĥ). The basic unrecorded vowel structure has been transcribed as A-I-U. Related scripts have been transliterated along the same principles but will include additional characters (ک K گ G ژ Ž پ P چ Č).
― Hebrew: א ’i ב B ג G ד D ה H ו Ŭ-Ū ז Z ח H ט T י J-Ī כ K ל L מ M נ N ס S ע C פ P צ S ק Q ר R ש Š-Ś ת T. A broader range of vowels has been transcribed.
― Syriac: ܐ ’-Ā ܒ B ܓ G ܕ D ܗ H ܘ Ŭ-Ū ܙ Z ܚ H ܛ T ܝ J-Ī ܟ K ܠ L ܡ M ܢ N ܣ S ܥ C ܦ P ܨ S ܩ Q ܪ R ܫ Š ܬ T.
― Amharic/Ethiopic: አ Ä ኡ U ኢ I ኣ A ኤ E እ - ኦ O ሀ H ለ L ሐ H መ M ሠ Ś ረ R ሰ S ሸ Š ቀ Q ቈ QŬ በ B ቨ V ተ T ቸ Č ኀ Ĥ ኈ ĤŬ ነ N ኘ Ñ አ ’ ከ K ኰ KŬ ኸ X ዀ XŬ ወ Ŭ ዐ C ዘ Z ዠ Ž የ J ደ D ጀ Ž ገ G ጐ GŬ ጠ T ጨ Č ጰ P ጸ S ፀ D ፈ F ፐ P ጘ Ŋ ⶓ ŊŬ
Dēvanāgarī: अ A आ Ā इ I ई Ī उ U ऊ Ū ऋ Ŗ ए Ē ऐ Æ ओ Ō औ Å क K ख K‛ ग G घ G‛ ङ Ŋ च C छ C‛ ज J झ J‛ ञ Ñ ट T ठ T‛ ड D ढ D‛ ण N त T थ T‛ द D ध D‛ न N प P फ P‛ ब B भ B‛ म M य J र R ल L व V श Ś ष S स S ह H ळ L क़ Q ख़ X ग़ Ğ ज़ Z फ़ F
― Bengali: অ A আ Ā ই I ঈ Ī উ U ঊ Ū ঋ Ŗ এ Ē ঐ Æ ও Ō ঔ Å ক K খ K‛ গ G ঘ G‛ ঙ-ং Ŋ চ C ছ C‛ জ J ঝ J‛ ঞ Ñ ট T ঠ T‛ ড D ঢ D‛ ণ N ত T থ T‛ দ D ধ D‛ ন N প P ফ P‛ ব B ভ B‛ ম M য J য় J-Ŭ র R ল L শ Ś ষ S স S হ H ড় R ঢ় R‛ ঃ ’ ঁ ~
― Gurmukhi/Punjabi: ਅ A ਆ Ā ਇ I ਈ Ī ਉ U ਊ Ū ਏ Ē ਐ Æ ਓ Ō ਔ Å ਕ K ਖ K‛ ਗ G ਘ G‛ ਙ Ŋ ਚ C ਛ C‛ ਜ J ਝ J‛ ਞ Ñ ਟ T ਠ T‛ ਡ D ਢ D‛ ਣ N ਤ T ਥ T‛ ਦ D ਧ D‛ ਨ N ਪ P ਫ P‛ ਬ B ਭ B‛ ਮ M ਯ J ਰ R ਲ L ਵ V ਸ S ਹ H ਸ਼ Š ਲ਼ L ਖ਼ X ਗ਼ Ğ ਜ਼ Z ੜ R ਫ਼ F ੰ ~ ਂ ~ ੱ :
― Gujarati: અ A આ Ā ઇ I ઈ Ī ઉ U ઊ Ū ઋ Ŗ ૠ Ŗ: ઍ E એ Ē ઐ Æ ઑ O ઓ Ō ઔ Å ક K ખ K‛ ગ G ઘ G‛ ઙ Ŋ ચ C છ C‛ જ J ઝ J‛ ઞ Ñ ટ T ઠ T‛ ડ D ઢ D‛ ણ N ત T થ T‛ દ D ધ D‛ ન N પ P ફ P‛ બ B ભ B‛ મ M ય J ર R લ L ળ L વ V શ Ś ષ S સ S હ H ં ~ ઃ ’
― Sinhalese: අ A ආ Ā ඇ E ඈ Ē ඉ I ඊ Ī උ U ඌ Ū ඍ Ŗ ඎ Ŗ: ඏ Ļ ඐ Ļ: එ E ඒ Ē ඓ Æ ඔ O ඕ Ō ඖ Å ක K ඛ K‛ ග G ඝ G‛ ඞ Ŋ ඟ ŊG ච C ඡ C‛ ජ J ඣ J‛ ඤ Ñ ඦ ÑJ ට T ඨ T‛ ඩ D ඪ D‛ ණ N ඬ ND ත T ථ T‛ ද D ධ D‛ න N ඳ ND ප P ඵ P‛ බ B භ B‛ ම M ඹ MB ය J ර R ල L ව V ශ Ś ෂ S ස S හ H ළ L ෆ F ං ~ ඃ ’
― Oriya: ଅ A ଆ Ā ଇ I ଈ Ī ଉ U ଊ Ū ଋ Ŗ ୠ Ŗ: ଌ Ļ ୡ Ļ: ଏ Ē ଐ Æ ଓ Ō ଔ Å କ K ଖ K‛ ଗ G ଘ G‛ ଙ Ŋ ଚ C ଛ C‛ ଜ J ଝ J‛ ଞ Ñ ଟ T ଠ T‛ ଡ D ଢ D‛ ଣ N ତ T ଥ T‛ ଦ D ଧ D‛ ନ N ପ P ଫ P‛ ବ B ଭ B‛ ମ M ୟ J ଯ Ž ର R ଲ L ଳ L ଶ Ś ଷ S ସ S ହ H ଡ଼ R ଢ଼ R‛ ଁ ~ ଃ ’.
― Malayalam: അ A ആ Ā ഇ I ഈ Ī ഉ U ഊ Ū ഋ Ŗ എ E ഏ Ē ഐ Æ ഒ O ഓ Ō ഔ Å ക K ഖ K‛ ഗ G ഘ G‛ ങ Ŋ ച C ഛ C‛ ജ J ഝ J‛ ഞ Ñ ട T ഠ T‛ ഡ D ഢ D‛ ണ N ത T ഥ T‛ ദ D ധ D‛ ന N പ P ഫ P‛ ബ B ഭ B‛ മ M യ J ര R ല L വ V ശ Ś ഷ S സ S ഹ H ള L ഴ L റ R ം ~ ഃ ’
― Kannada: ಅ A ಆ Ā ಇ I ಈ Ī ಉ U ಊ Ū ಋ Ŗ ೠ Ŗ: ಌ Ļ ೡ Ļ: ಎ E ಏ Ē ಐ Æ ಒ O ಓ Ō ಔ Å ಕ K ಖ K‛ ಗ G ಘ G‛ ಙ Ŋ ಚ C ಛ C‛ ಜ J ಝ J‛ ಞ Ñ ಟ T ಠ T‛ ಡ D ಢ D‛ ಣ N ತ T ಥ T‛ ದ D ಧ D‛ ನ N ಪ P ಫ P‛ ಬ B ಭ B‛ ಮ M ಯ J ರ R ಱ R ಲ L ಳ L ವ V ಶ Ś ಷ S ಸ S ಹ H ೞ F ಂ ~ ಃ ’
― Telugu: అ A ఆ Ā ఇ I ఈ Ī ఉ U ఊ Ū ఋ Ŗ ౠ Ŗ: ఌ Ļ ౡ Ļ: ఎ E ఏ Ē ఐ Æ ఒ O ఓ Ō ఔ Å క K ఖ K‛ గ G ఘ G‛ ఙ Ŋ చ C ఛ C‛ జ J ఝ J‛ ఞ Ñ ట T ఠ T‛ డ D ఢ D‛ ణ N త T థ T‛ ద D ధ D‛ న N ప P ఫ P‛ బ B భ B‛ మ M య J ర R ఱ R ల L ళ L వ V శ Ś ష S స S హ H ం ~ ః ’
― Tamil: அ A ஆ Ā இ I ஈ Ī உ U ஊ Ū எ E ஏ Ē ஐ Æ ஒ O ஓ Ō ஔ Å க K ங Ŋ ச C ஞ Ñ ட T ண N த T ந N ப P ம M ய J ர R ல L வ V ழ Z ள L ற Ř ன Ň (ஜ J ஷ Ś ஸ S ஹ H)
― Thai: ก K ข K‛ ฃ K‛´ ค G ฅ G´ ฆ G‛ ง Ŋ จ C ฉ C‛ ช J ซ J´ ฌ J‛ ญ Ñ ฎ 'D ฏ T ฐ T‛ ฑ D ฒ D‛ ณ N ด 'D ต T ถ T‛ ท D ธ D‛ น N บ 'B ป P ผ P‛ ฝ P‛´ พ B ฟ B´ ภ B‛ ม M ย J ร R ฤ Ŗ ล L ฦ Ļ ว Ŭ ศ Ś ษ S ส S ห H ฬ L อ ’ ฮ H
― Burmese: က K ခ K‛ ဂ G ဃ G‛ င Ŋ စ C ဆ C‛ ဇ J ဈ J‛ ဉ/ည Ñ ဋ T ဌ T‛ ဍ D ဎ D‛ ဏ N တ T ထ T‛ ဒ D ဓ D‛ န N ပ P ဖ P‛ ဗ B ဘ B‛ မ M ယ J ရ R လ L ဝ Ŭ သ S ဟ H ဠ L အ ’.
― Tibetan: ཨ A ཨི I ཨུ U ཨེ E ཨོ O ཀ K ཁ K‛ ག G ང Ŋ ཅ C ཆ C‛ ཇ J ཉ Ñ ཏ T ཐ T‛ ད D ན N པ P ཕ P‛ བ B མ M ཙ S ཚ S‛ ཛ Z ཝ Ŭ ཞ Ž ཟ Z འ ’ ཡ J ར R ལ L ཤ Ś ས S ཧ H (ཊ T ཋ T‛ ཌ D ཎ N ཥ S).
Japanese: ア A イ I ウ U エ E オ O カ K ガ G サ S ザ Z タ T ダ D ナ N ハ H バ B パ P マ M ヤ J ラ R ワ Ŭ. Transliteration of kana is direct, including for long vowels, based on a phonemic transcription. ( tu ) stands for small ッ TU when marking a geminate consonant, again corresponding to kana, as well as extended pronunciation. Kanji are transcribed phonemically, as kana above.
Korean: ᅡ A ᅢ Æ ᅣ JA ᅤ JÆ ᅥ Ə ᅦ E ᅧ JƏ ᅨ JE ᅩ O ᅭ JO ᅮ U ᅲ JU ᅳ U ᅵ I ᄀ K ᄂ N ᄃ T ᄅ L ᄆ M ᄇ P ᄉ S ᄋ -/Ŋ ᄌ C ᄎ C‛ ᄏ K‛ ᄐ T‛ ᄑ P‛ ᄒ H. Since the han kul is always transliterated with cluster (syllable) breaks, the elements of the han kul are transliterated separately, yielding ᅪ OA ᅫ OÆ ᅬ OI ᅯ UƏ ᅰ UE ᅱ UI ᅴ UI ᄁ KK ᄄ TT ᄈ PP ᄊ SS ᄍ CC, noting that ᅬ OI is rendered [ø] or [we], and ᅱ UI [y] or [wi].
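Because each element of a syllable block is transliterated separately, the procedure is straightforward to sketch in Python. The arithmetic is the standard Unicode decomposition of precomposed Hangul syllables (base U+AC00, 21 vowels, 28 final positions). The function name, the empty string for the silent initial ᄋ, and the spellings for compound finals (built by concatenating the values listed above) are assumptions made for illustration, not rules stated by this site.

```python
# Latin values follow the Korean list above; list order follows Unicode's jamo order.
LEAD = ["K", "KK", "N", "T", "TT", "L", "M", "P", "PP", "S",
        "SS", "", "C", "CC", "C‛", "K‛", "T‛", "P‛", "H"]   # "" = silent initial (assumed)
VOWEL = ["A", "Æ", "JA", "JÆ", "Ə", "E", "JƏ", "JE", "O", "OA", "OÆ",
         "OI", "JO", "U", "UƏ", "UE", "UI", "JU", "U", "UI", "I"]
TAIL = ["", "K", "KK", "KS", "N", "NC", "NH", "T", "L", "LK", "LM", "LP",
        "LS", "LT‛", "LP‛", "LH", "M", "P", "PS", "S", "SS", "Ŋ", "C",
        "C‛", "K‛", "T‛", "P‛", "H"]   # compound finals are assumed concatenations

SYLLABLE_BASE = 0xAC00
SYLLABLE_COUNT = 19 * 21 * 28


def transliterate_hangul(text):
    """Transliterate precomposed syllables one block at a time, separated by spaces."""
    blocks = []
    for ch in text:
        index = ord(ch) - SYLLABLE_BASE
        if 0 <= index < SYLLABLE_COUNT:
            lead, rest = divmod(index, 21 * 28)
            vowel, tail = divmod(rest, 28)
            blocks.append(LEAD[lead] + VOWEL[vowel] + TAIL[tail])
        else:
            blocks.append(ch)   # leave anything that is not a Hangul syllable alone
    return " ".join(blocks)


if __name__ == "__main__":
    print(transliterate_hangul("한글"))   # HAN KUL
```

The sketch keeps one Latin cluster per syllable block, which matches the cluster breaks described above. It reproduces the values exactly as printed, so ᅮ and ᅳ both come out as ( U ).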
Armenian: Աա A Բբ B Գգ G Դդ D Եե E Զզ Z Էէ Ē Ըը Ə Թթ T‛ Ժժ Ž Իի I Լլ L Խխ X Ծծ C Կկ K Հհ H Ձձ Z Ղղ Ğ Ճճ Č Մմ M Յյ J Նն N Շշ Š Ոո O Չչ Č‛ Պպ P Ջջ Ž Ռռ Ř Սս S Վվ V Տտ T Րր R Ցց C‛ Ււ U Փփ P‛ Քք K‛ Օօ Ō Ֆֆ F և eu. The eastern pronunciation has been used as a standard.
Georgian: ა A ბ B გ G დ D ე E ვ V ზ Z თ T‛ ი I კ K ლ L მ M ნ N ო O პ P ჟ Ž რ R ს S ტ T უ U ფ P‛ ქ K‛ ღ Ğ ყ Q შ Š ჩ Č‛ ც C‛ ძ Z წ C ჭ Č ხ X ჯ Ž ჰ H ჱ Ē ჲ J ჳ ŬI ჴ Q‛ ჵ Ō ჶ F.
Other conventions used across systems: ( ‛ ) indicates aspiration, ( ~ ) nasalization, and ( ’ ) glottal stop; ( J ) represents [j], ( Ŭ ) represents [w], ( X ) represents [x], ( Z ) represents [dz], ( Ž ) represents [dʒ], ( Þ ) represents [θ], and ( Ð ) represents [ð]; due to typographical restrictions, underscoring ( _ ) replaces a subscript dot.
In contexts where the Hellenistic scripts are transliterated, the following system is used:
Greek: Α-A Β-B Γ-G/Ŋ Δ-D Ε-E Ζ-Z Η-Ē Θ-T‛ Ι-I Κ-K Λ-L Μ-M Ν-N Ξ-X Ο-O Π-P Ρ-R Σ-S Τ-T Υ-U Φ-P‛ Χ-K‛ Ψ-PS Ω-Ō
― Cyrillic: А-A Б-B В-V Г-G Ґ-Ġ Д-D Ђ-Ð Ѓ-G’ Е-E Ё-JO Є-É Ж-Ž Ѕ-Z З-Z Ӡӡ-Z И-I Ѳ-Þ І-Í Ї-JÍ Й-Ĭ Ј-J К-K Ҝ-Ġ Л-L Љ-L’ М-M Н-N Њ-N’ Ң-Ŋ Ѯ-Ks О-O Ө-Ö П-P Р-R С-S Т-T Ћ-Ć Ќ-K’ У-U Ў-Ŭ Ұ-U Ү-Ü Ф-F Х-X Һ-H Ѱ-Ps Ѡ-Ō Ц-C Ч-Č Ҹ-Ž Џ-Ž Ш-Š Щ-Ŝ Ъ-Ə/ə Ы-Y Ь-’ Ѣ-Ě Э-È Ю-JU Я-JA Ѧ-Ę Ѩ-JĘ Ѫ-Õ Ѭ-JÕ Ѵ-Ý Ѥ-JĒ Ғ-Ğ Ӣ-Ī Қ-Q Ӯ-Ū Ҳ-H Ҷ-Ž.
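A table of this kind can be applied mechanically while keeping capitals and lowercase as contextual variants, as the minor principles require. The Python sketch below is illustrative only: the function name is hypothetical, the table holds just a subset of the Cyrillic entries above, and lowercasing the whole Latin value for a lowercase source letter is an assumption about how case should carry over.

```python
# Subset of the Cyrillic list above; keys are capital Cyrillic letters.
CYRILLIC = {
    "А": "A", "Б": "B", "В": "V", "Г": "G", "Д": "D", "Е": "E",
    "Ж": "Ž", "З": "Z", "И": "I", "К": "K", "Л": "L", "М": "M",
    "Н": "N", "О": "O", "П": "P", "Р": "R", "С": "S", "Т": "T",
    "У": "U", "Ф": "F", "Х": "X", "Ц": "C", "Ч": "Č", "Ш": "Š",
}


def transliterate_cyrillic(text):
    """Replace each known Cyrillic letter, carrying its case over to the Latin form."""
    out = []
    for ch in text:
        latin = CYRILLIC.get(ch.upper())
        if latin is None:
            out.append(ch)            # unknown characters pass through unchanged
        elif ch.isupper():
            out.append(latin)
        else:
            out.append(latin.lower())
    return "".join(out)


if __name__ == "__main__":
    print(transliterate_cyrillic("Москва"))   # Moskva
    print(transliterate_cyrillic("Чехов"))    # Čexov
```

Because case is handled outside the table, a single entry per letter is enough, and no Latin capital has to be reserved for a separate sound.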
The stewardship site uses numerical Unicode, and also UTF-8 encoding, which allows the space-efficient encoding of all Unicode characters (subject to font availability on the target computer). To see the latter pages properly, the encoding on the internet browser must correspond; if the browser shows repeated boxes, it is likely not set for UTF-8. Under View/Font or View/Encoding, choose Auto(matically) Select (the best choice, as it will not interfere with the viewing of other codes); or set manually to UTF-8. Most of the characters used in this site are available in Unicode fonts. In all cases of unfamiliar (non-English) words, a translation into the standard English form will appear if the cursor is placed over the word. In many cases, I have been forced to use the only form available to me, already transliterated. In some cases, I have even retransliterated into what I suppose to be the original. Corrections to these usages are sought, and should be sent to the Stewardship Project, project
the-stewardship.org.
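As a rough illustration of the two representations mentioned above, the Python sketch below writes a character as a decimal numeric reference and counts its UTF-8 bytes. It assumes that "numerical Unicode" means HTML references of the form &#N;, which the text does not state outright, and the function names are hypothetical. The last function shows what happens when UTF-8 bytes are read under the wrong encoding, the kind of mismatch that produces boxes or stray characters in a browser.

```python
def numeric_reference(ch):
    """Decimal HTML-style reference for one character (assumed meaning of 'numerical Unicode')."""
    return f"&#{ord(ch)};"


def utf8_length(ch):
    """Number of bytes UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


def misread_as_latin1(text):
    """Decode UTF-8 bytes as Latin-1, simulating a browser set to the wrong encoding."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    for ch in ["A", "Š", "漢"]:
        print(ch, numeric_reference(ch), utf8_length(ch))
    # A &#65; 1
    # Š &#352; 2
    # 漢 &#28450; 3
    print(repr(misread_as_latin1("Š")))   # 'Å\xa0'
```

Plain Latin letters stay at one byte each, which is why UTF-8 is space-efficient for mostly English pages, while the transliteration letters and the original scripts take two or three bytes per character.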