From ce2cbbb619fac4bdf61ac25a5b080a25756b392c Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Wed, 15 Nov 2023 11:48:31 +0100 Subject: [PATCH 01/14] =?UTF-8?q?Update=20srr=5FLatn:=20=E1=B9=95=20in=20a?= =?UTF-8?q?uxiliary?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- Lib/gflanguages/data/languages/srr_Latn.textproto | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Lib/gflanguages/data/languages/srr_Latn.textproto b/Lib/gflanguages/data/languages/srr_Latn.textproto index 7b51ffd3..a05f6b59 100644 --- a/Lib/gflanguages/data/languages/srr_Latn.textproto +++ b/Lib/gflanguages/data/languages/srr_Latn.textproto @@ -8,6 +8,7 @@ region: "SN" exemplar_chars { base: "A B Ɓ C Ƈ D Ɗ E F G H I J K L M N Ñ Ŋ O P Ƥ Q R S T Ƭ U W X Y Ƴ a b ɓ c ƈ d ɗ e f g h i j k l m n ñ ŋ o p ƥ q r s t ƭ u w x y ƴ \'" marks: "◌̃" + auxiliary: "Ṕ ṕ" } sample_text { masthead_full: "WwIi" @@ -23,3 +24,5 @@ sample_text { specimen_21: "O xuu refna a wara was fa xalaat um a layin, waree jeg teen o njaaxid o leng, tam nuu te bugna ta nanel fat a layin kam saax le fa kataa saax.\nO kiin mu refna o ñoowaa kam ngentand um a ware aar a maakooyel no ñoow\'um no jeg um. Na aada um no ngentand um baa waag o andiɗ xooxum kam saax le.\nO xuu refna a wara o ñootnooxaa, a ñootin yiifum na ngap o mbodu yaa da njalna baa nger." 
specimen_16: "Oxuu refna a wara o gay kam saax le mbaat kam adna fee kaa taxna boo jam a jegaa kom ne yeegnit neene ɓistiiduuna.\nKe yeegnit neeke a ɓisiidna, waraand o ɓakand ni saax leng mbaat i mbokatoor, leng mbaat o kiino leng boo te tax ta fesooraa naa te warteerna ni ke warna jeg.\nNo andiɗ i jom le wiin fop a mbogna, fa ke warna den too te mbod teen keen fetu jam na adna fee,\nNo ñak o and fo o yeesandaa ke warna in, naa ɓisiidaa fitna, fa yiif a peƭaru no ñoow in too adna faynwiin we a layaa o ngalaat dan, taamaala matee den fo ñak keen refu ke wiin we a moƴ na o mbug no ñoow," } +source: "République du Sénégal, Décret no 2005-990 relatif à l’orthographe et la séparation des mots en seereer, 21 octobre 2005" +note: "A few documents use Ṕ ṕ instead of Ƥ ƥ." From a6baa608c90da1f8854c9f37ad2174158457bf23 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Wed, 15 Nov 2023 12:06:58 +0100 Subject: [PATCH 02/14] =?UTF-8?q?Update=20udu=5FLatn:=20note=20about=20?= =?UTF-8?q?=E1=B9=AF=E1=BA=96=20and=20t=CD=9Fh?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- Lib/gflanguages/data/languages/udu_Latn.textproto | 1 + 1 file changed, 1 insertion(+) diff --git a/Lib/gflanguages/data/languages/udu_Latn.textproto b/Lib/gflanguages/data/languages/udu_Latn.textproto index f102e876..063dbc75 100644 --- a/Lib/gflanguages/data/languages/udu_Latn.textproto +++ b/Lib/gflanguages/data/languages/udu_Latn.textproto @@ -25,3 +25,4 @@ sample_text { specimen_16: "Aris ’kwaniny’ceshi ’baar mo dho’thkunu ’baḵany mo dhali mmomiiya ṯu’c imonṯal ’de/ mo dhali mii ma ḵar/e mo. Uni mini ta gi gwo mo dhali mii mo dhali uni mini mii ka karambuye/ ’kup̱ ki cin tiya mo e shi/in mo dhali mii kun tanu ikam mo.\nAris ’kwaniny’ceshi ’baar mo dho’thkunu ’baḵany mo dhali mmomiiya ṯu’c imonṯal ’de/ mo dhali mii ma ḵar/e mo. 
Uni mini ta gi gwo mo dhali mii mo dhali uni mini mii ka karambuye/ ’kup̱ ki cin tiya mo e shi/in mo dhali mii kun tanu ikam mo.\nAris ’kwaniny’ceshi ’baar mo dho’thkunu ’baḵany mo dhali mmomiiya ṯu’c imonṯal ’de/ mo dhali mii ma ḵar/e mo. Uni mini ta gi gwo mo dhali mii mo dhali uni mini mii ka karambuye/ ’kup̱ ki cin tiya mo e shi/in mo dhali mii kun tanu ikam mo." } source: "Don Killian, Topics in Uduk Phonology and Morphosyntax, University of Helsinki, 2015" +note: "Some references use ṯẖ instead of t͟h." From 2bab81545186d7e2ad9b4bd5481d3904d34923d7 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Thu, 16 Nov 2023 08:47:05 +0100 Subject: [PATCH 03/14] Add ndv_Latn --- Lib/gflanguages/data/languages/ndv_Latn.textproto | 13 +++++++++++++ 1 file changed, 13 insertions(+) create mode 100644 Lib/gflanguages/data/languages/ndv_Latn.textproto diff --git a/Lib/gflanguages/data/languages/ndv_Latn.textproto b/Lib/gflanguages/data/languages/ndv_Latn.textproto new file mode 100644 index 00000000..a81c8fcb --- /dev/null +++ b/Lib/gflanguages/data/languages/ndv_Latn.textproto @@ -0,0 +1,13 @@ +id: "ndv_Latn" +language: "ndv" +script: "Latn" +name: "Ndut" +population: 60000 +region: "SN" +exemplar_chars { + base: "Ꞌ ꞌ A a {AA} {aa} B b Ɓ ɓ C c D d Ɗ ɗ E e {EE} {ee} É é {ÉE} {ée} Ë ë {ËE} {ëe} F f G g H h I i {II} {ii} Í í {ÍI} {íi} J j K k L l M m {MB} {mb} N n {ND} {nd} Ñ ñ {NJ} {nj} Ŋ ŋ {NG} {ng} O o {OO} {oo} P p R r S s T t U u {UU} {uu} Ú ú {ÚU} {úu} W w Y y Ƴ ƴ" + marks: "◌́ ◌̃ ◌̈" + auxiliary: "ˈ" +} +source: "Massaer Mbengue, Daniel Morgan, Manuel pour lire et écrire le ndút, SIL Sénégal, 2021" +note: "Ꞌ ꞌ are on the Keyman SIL Senegal Ndut keyboard instead of ˈ (used in Mbengue & Morgan 2021 among others)." 
From f12ce74c7a45537ca9cab4280d3610d49eddab17 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Thu, 16 Nov 2023 11:03:57 +0100 Subject: [PATCH 04/14] Add cae_Latn --- Lib/gflanguages/data/languages/cae_Latn.textproto | 15 +++++++++++++++ 1 file changed, 15 insertions(+) create mode 100644 Lib/gflanguages/data/languages/cae_Latn.textproto diff --git a/Lib/gflanguages/data/languages/cae_Latn.textproto b/Lib/gflanguages/data/languages/cae_Latn.textproto new file mode 100644 index 00000000..a0636509 --- /dev/null +++ b/Lib/gflanguages/data/languages/cae_Latn.textproto @@ -0,0 +1,15 @@ +id: "cae_Latn" +language: "cae" +script: "Latn" +name: "Lehar" +preferred_name: "Laalaa" +autonym: "laalaa" +population: 19000 +region: "SN" +exemplar_chars { + base: "A a Á á B b Ɓ ɓ C c D d Ɗ ɗ E e É é Ë ë F f G g H h I i Í í J j K k L l M m N n Ñ ñ {NJ} {nj} Ŋ ŋ {NG} {ng} O o Ó ó P p R r S s T t U u Ú ú W w Y y Ƴ ƴ Ɂ '" + marks: "◌́ ◌̃ ◌̈" + auxiliary: "ꞌ ˈ" +} +source: "Christina Thornell, Ndiol Malick Tine, Roger Samba Faye & Gilbert Guilang Thiaw, Lexique langue cangin laalaa – français, Dakar: Sé Wínóo, SIL, 2016" +note: "Ɂ ' are used as a casing pair in Thornell et al. 2016." 
From e49f8a3543f4cd9248f010dc9d164bc50c63c7d6 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Fri, 17 Nov 2023 21:07:44 +0100 Subject: [PATCH 05/14] Update sms_Latn based on Feist 2010 --- Lib/gflanguages/data/languages/sms_Latn.textproto | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Lib/gflanguages/data/languages/sms_Latn.textproto b/Lib/gflanguages/data/languages/sms_Latn.textproto index 7c022c43..b91996be 100644 --- a/Lib/gflanguages/data/languages/sms_Latn.textproto +++ b/Lib/gflanguages/data/languages/sms_Latn.textproto @@ -5,8 +5,11 @@ name: "Skolt Sami" autonym: "Nuõrttsääʼmǩiõll" population: 612 region: "FI" +region: "RU" exemplar_chars { - base: "A B C D E F G H I J K L M N O P Q R S T U V X Y Z Â Ä Å Ö Õ Č Đ Ŋ Š Ž Ǥ Ǧ Ǩ Ǯ Ʒ a b c d e f g h i j k l m n o p q r s t u v x y z â ä å ö õ č đ ŋ š ž ǥ ǧ ǩ ǯ ʒ ʼ" - auxiliary: "Å Ö Ø å ö ø" + base: "A B C D E F G H I J K L M N O P Q R S T U V X Y Z Â Ä Å Ö Õ Č Đ Ŋ Š Ž Ǥ Ǧ Ǩ Ǯ Ʒ a b c d e f g h i j k l m n o p q r s t u v x y z â ä å ö õ č đ ŋ š ž ǥ ǧ ǩ ǯ ʒ ʼ ʹ" + auxiliary: "Å Ö Ø å ö ø ˊ ´ ˈ" marks: "◌̂ ◌̃ ◌̈ ◌̊ ◌̌" } +source: "Timothy Feist, A grammar of Skolt Saami, University of Manchester, 2010" +note: "ʹ U+02B9 is used to indicate palatalization; ˊ U+02CA or ´ U+00B4 is sometimes used instead. ʼ U+02BC, or ’ U+2019, is used between consonant letters to indicate they are not a digraph: lʼj and nʼj. Some dictionaries use ˈ U+02C8, or ' U+0027, to indicate geminate consonants after diphthongs." 
From e0694d78193fb7108ce459335826a6d4040d8e80 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Fri, 17 Nov 2023 21:07:56 +0100 Subject: [PATCH 06/14] Add kia_Latn --- Lib/gflanguages/data/languages/kia_Latn.textproto | 13 +++++++++++++ 1 file changed, 13 insertions(+) create mode 100644 Lib/gflanguages/data/languages/kia_Latn.textproto diff --git a/Lib/gflanguages/data/languages/kia_Latn.textproto b/Lib/gflanguages/data/languages/kia_Latn.textproto new file mode 100644 index 00000000..6a901114 --- /dev/null +++ b/Lib/gflanguages/data/languages/kia_Latn.textproto @@ -0,0 +1,13 @@ +id: "kia_Latn" +language: "kia" +script: "Latn" +name: "Kim" +population: 53000 +region: "TD" +exemplar_chars { + base: "a A à À á Á ā Ā {a̰} {A̰} b B ɓ Ɓ c C d D ɗ Ɗ e E é É ḛ Ḛ f F g G h H i I ḭ Ḭ j J k K l L {ls} {LS} m M {mb} {MB} n N {nd} {ND} {nj} {NJ} ŋ Ŋ {ŋg} {ŊG} o O ó Ó {o̰} {O̰} p P r R s S t T u U ú Ú ṵ Ṵ v V w W y Y z Z" + auxiliary: "q Q x X" + marks: "◌́ ◌̄ ◌̰" +} +source: "Association pour la Promotion de la Langue Kim (APLK), Sigir Wak Kwasap (Les contes en langue kim), APLK Tchad, 2012" +source: "CTBLK, Hate isi wak kwasap, vol. 
1-2, CTBLK, 2010" From 39bf76bdbf6cfb8cdf328919bf9058ac01a19819 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Fri, 17 Nov 2023 22:01:39 +0100 Subject: [PATCH 07/14] Update sba_Latn --- Lib/gflanguages/data/languages/sba_Latn.textproto | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/Lib/gflanguages/data/languages/sba_Latn.textproto b/Lib/gflanguages/data/languages/sba_Latn.textproto index 46513205..aefa2781 100644 --- a/Lib/gflanguages/data/languages/sba_Latn.textproto +++ b/Lib/gflanguages/data/languages/sba_Latn.textproto @@ -7,6 +7,12 @@ region: "CM" region: "NG" region: "TD" exemplar_chars { - base: "a A b B d D e E ɛ Ɛ g G h H i I j J k K l L m M n N o O ɔ Ɔ p P r R s S t T u U v V w W y Y" - auxiliary: "c C f F q Q x X z Z" -} \ No newline at end of file + base: "a A à À á Á {a̰} {A̰} {á̰} {Á̰} b B d D e E è È é É ḛ Ḛ {ḛ̀} {Ḛ̀} {ḛ́} {Ḛ́} ə Ə {ə́} {Ə́} {ə̰} {Ə̰} g G h H i I ḭ Ḭ j J k K l L m M n N o O ó Ó {o̰} {O̰} ɔ Ɔ p P r R s S t T u U ú Ú ṵ Ṵ v V w W y Y" + marks: "◌̀ ◌́ ◌̰" + auxiliary: "c C f F ǝ Ǝ {ǝ́} {Ǝ́} {ǝ̰} {Ǝ̰} ɨ Ɨ q Q x X z Z" +} +note: "Hartell 1982 shows bb dd ɛ, but LETAC 1983 and ABT 1989 don’t use bb dd ɛ and use ɓ ɗ ə instead, and Chata et al. 2012 uses ɓ ɗ ə ɨ. Tchad 2009 uses Ə; ABT 2015 uses its lowercase but uses uppercase Ǝ." +source: "Alliance Biblique du Tchad, Maktub gə́ To gə Kəmee, first edition 1989, 2015" +source: "Bernard Laumal ge Boukar Selim, Jean-Pierre Caprile & Khalil Alio, Lexiques thématiques de l’Afrique centrale (LETAC) : Tchad, Sara-Ngambay, Activité économiques et sociales 1, Paris: Yaoundé, Agence de coopération culturelle et technique, Centre régional de recherche et de documentation sur les traditions orales et pour le développement des langues africaines, 1983" +source: "Nangone Jacob Chata, Simeon Mbayrem Djimadoum & John M. Keegan, Petit dictionnaire de la langue ngambaye, Cuenca: Morkeg Books, coll. 
The Sara Language Project, 2012" +source: "Tchad, Décret fixant l’Alphabet National du Tchad, 2009" From 1476655fbfd591c1a6cce1bf217fdea7b6e0c838 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Mon, 20 Nov 2023 11:49:51 +0100 Subject: [PATCH 08/14] Add gvl_Latn --- Lib/gflanguages/data/languages/dnj_Latn.textproto | 4 ++-- Lib/gflanguages/data/languages/gvl_Latn.textproto | 15 +++++++++++++++ 2 files changed, 17 insertions(+), 2 deletions(-) create mode 100644 Lib/gflanguages/data/languages/gvl_Latn.textproto diff --git a/Lib/gflanguages/data/languages/dnj_Latn.textproto b/Lib/gflanguages/data/languages/dnj_Latn.textproto index f89e7c46..fb5b62ba 100644 --- a/Lib/gflanguages/data/languages/dnj_Latn.textproto +++ b/Lib/gflanguages/data/languages/dnj_Latn.textproto @@ -5,9 +5,9 @@ name: "Dan" population: 1099244 region: "CI" exemplar_chars { - base: "a A ȁ Ȁ à À ā Ā á Á {a̋} {A̋} â  æ Æ {æ̏} {Æ̏} {æ̀} {Æ̀} ǣ Ǣ ǽ Ǽ {æ̋} {Æ̋} {æ̂} {Æ̂} ʌ Ʌ {ʌ̏} {Ʌ̏} {ʌ̀} {Ʌ̀} {ʌ̄} {Ʌ̄} {ʌ́} {Ʌ́} {ʌ̋} {Ʌ̋} {ʌ̂} {Ʌ̂} b B d D e E ȅ Ȅ è È ē Ē é É {e̋} {E̋} ê Ê ɛ Ɛ {ɛ̏} {Ɛ̏} {ɛ̀} {Ɛ̀} {ɛ̄} {Ɛ̄} {ɛ́} {Ɛ́} {ɛ̋} {Ɛ̋} {ɛ̂} {Ɛ̂} f F i I ȉ Ȉ ì Ì ī Ī í Í {i̋} {I̋} î Î g G h H k K l L n N ŋ Ŋ o O ȍ Ȍ ò Ò ō Ō ó Ó ő Ő ô Ô ɔ Ɔ ɤ {ɤ̏} {ɤ̀} {ɤ̄} {ɤ́} {ɤ̋} {ɤ̂} œ Œ {œ̏} {Œ̏} {œ̀} {Œ̀} {œ̄} {Œ̄} {œ́} {Œ́} {œ̋} {Œ̋} {œ̂} {Œ̂} p P s S t T u U ȕ Ȕ ù Ù ū Ū ú Ú ű Ű û Û ɯ Ɯ {ɯ̏} {Ɯ̏} {ɯ̀} {Ɯ̀} {ɯ̄} {Ɯ̄} {ɯ́} {Ɯ́} {ɯ̋} {Ɯ̋} {ɯ̂} {Ɯ̂} v V w W y Y z Z ʼ" + base: "a A ȁ Ȁ à À ā Ā á Á {a̋} {A̋} â  æ Æ {æ̏} {Æ̏} {æ̀} {Æ̀} ǣ Ǣ ǽ Ǽ {æ̋} {Æ̋} {æ̂} {Æ̂} ʌ Ʌ {ʌ̏} {Ʌ̏} {ʌ̀} {Ʌ̀} {ʌ̄} {Ʌ̄} {ʌ́} {Ʌ́} {ʌ̋} {Ʌ̋} {ʌ̂} {Ʌ̂} b B d D e E ȅ Ȅ è È ē Ē é É {e̋} {E̋} ê Ê ɛ Ɛ {ɛ̏} {Ɛ̏} {ɛ̀} {Ɛ̀} {ɛ̄} {Ɛ̄} {ɛ́} {Ɛ́} {ɛ̋} {Ɛ̋} {ɛ̂} {Ɛ̂} f F i I ȉ Ȉ ì Ì ī Ī í Í {i̋} {I̋} î Î g G h H k K l L n N ŋ Ŋ o O ȍ Ȍ ò Ò ō Ō ó Ó ő Ő ô Ô ɔ Ɔ ɤ Ɤ {ɤ̏} {Ɤ̏} {ɤ̀} {Ɤ̀} {ɤ̄} {Ɤ̄} {ɤ́} {Ɤ́} {ɤ̋} {Ɤ̋} {ɤ̂} {Ɤ̂} œ Œ {œ̏} {Œ̏} {œ̀} {Œ̀} {œ̄} {Œ̄} {œ́} {Œ́} {œ̋} {Œ̋} {œ̂} {Œ̂} p P s S t T u U ȕ Ȕ ù Ù ū Ū ú Ú ű Ű û Û ɯ Ɯ {ɯ̏} {Ɯ̏} {ɯ̀} {Ɯ̀} {ɯ̄} 
{Ɯ̄} {ɯ́} {Ɯ́} {ɯ̋} {Ɯ̋} {ɯ̂} {Ɯ̂} v V w W y Y z Z ʼ" marks: "◌̋ ◌́ ◌̄ ◌̀ ◌̏ ◌̂ ◌̈" auxiliary: "ë Ë ö Ö ü Ü {ʋ̈} {Ʋ̈} ˗ ꞊ ˮ" } note: "TODO: add uppercase of ɤ and accented forms once it is in Unicode 16.0." -source: "Gué Nestor, Kpan Joséphine, Vydrin Valentin & Zeh Emmanuel, Syllabaire dan de l’Est Livre d’enseignants, Man – Abidjan: Pȁbhɛ̄nbhȁbhɛ̏n - EDILIS, 2020" \ No newline at end of file +source: "Gué Nestor, Kpan Joséphine, Vydrin Valentin & Zeh Emmanuel, Syllabaire dan de l’Est Livre d’enseignants, Man – Abidjan: Pȁbhɛ̄nbhȁbhɛ̏n - EDILIS, 2020" diff --git a/Lib/gflanguages/data/languages/gvl_Latn.textproto b/Lib/gflanguages/data/languages/gvl_Latn.textproto new file mode 100644 index 00000000..8c703818 --- /dev/null +++ b/Lib/gflanguages/data/languages/gvl_Latn.textproto @@ -0,0 +1,15 @@ +id: "gvl_Latn" +language: "gvl" +script: "Latn" +name: "Gulay" +population: 250478 +region: "TD" +exemplar_chars { + base: "a A à À á Á b B ɓ Ɓ c C d D ɗ Ɗ e E é É è È ə Ə {ə́} {Ə́} ɛ Ɛ {ɛ́} {Ɛ́} g G h H i I í Í ì Ì ɨ Ɨ j J k K l L m M {m̀} {M̀} ḿ Ḿ {mb} {MB} n N ń Ń {nd} {ND} {ng} {NG} {nj} {NJ} {ny} {NY} o O ò Ò ó Ó ɔ Ɔ {ɔ́} {Ɔ́} p P r R s S t T u U ù Ù ú Ú {vb} {VB} w W y Y" + marks: "◌̀ ◌́ ◌̂" + auxiliary: "f F q Q v V x X z Z ê Ê {ə̀} {Ə̀} {ɛ̀} {Ɛ̀} {ɨ̀} {Ɨ̀} {ɨ́} {Ɨ́}" +} +note: "According to Ngaradoumbaye 2017: low tone and middle tone are the most frequent, high tone is always marked with acute, low tone is only marked with grave when there is an ambiguity. ABT 2004 does not use ə ɛ. Ngaradoumbaye 2012 does not use ə ɛ either but does use ê." +source: "Alliance Biblique du Tchad, Le Nouveau Testament en langue gulei, Alliance Biblique du Tchad, 2004" +source: "Ngaradoumbaye N. 
Clément, Alpabe kɨ se ta guləy - Abécédaire en langue gùlə̀y, Université de Yaoundé, 2017" +source: "Ngaradoumbaye Clément, Gotolbaye Chico, Nadjirissengar Moïse & Seouta Blaise, Gostà kumkunjuge se tà guley, ATPLG Tchad (Association pour la Transcription et la Promotion de la Langue Gouley), 2012" From a4e9603f4adaed2205067e0af208ab89fe57e169 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Mon, 20 Nov 2023 11:51:06 +0100 Subject: [PATCH 09/14] test_data_languages: skip dnj_Latn because of future Ramshorn --- tests/test_data_languages.py | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/test_data_languages.py b/tests/test_data_languages.py index 855cbee7..ed7f7e18 100644 --- a/tests/test_data_languages.py +++ b/tests/test_data_languages.py @@ -44,6 +44,7 @@ "hur_Latn": "Does indeed use Greek glyphs while writing Latin", "kwk_Latn": "Does indeed use Greek glyphs while writing Latin", "thp_Latn": "Does indeed use Greek glyphs while writing Latin", + "dnj_Latn": "Does use future Unicode 16 Latin glyphs", } SKIP_REGION = { From 859e319867a51cfd0b201977dab9cc507a01c87b Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Mon, 20 Nov 2023 11:51:26 +0100 Subject: [PATCH 10/14] Add mwm_Latn --- Lib/gflanguages/data/languages/mwm_Latn.textproto | 15 +++++++++++++++ 1 file changed, 15 insertions(+) create mode 100644 Lib/gflanguages/data/languages/mwm_Latn.textproto diff --git a/Lib/gflanguages/data/languages/mwm_Latn.textproto b/Lib/gflanguages/data/languages/mwm_Latn.textproto new file mode 100644 index 00000000..524c31a0 --- /dev/null +++ b/Lib/gflanguages/data/languages/mwm_Latn.textproto @@ -0,0 +1,15 @@ +id: "mwm_Latn" +language: "mwm" +script: "Latn" +name: "Sar" +population: 500000 +region: "TD" +exemplar_chars { + base: "a A á Á ā Ā {a̰} {A̰} {á̰} {Á̰} {ā̰} {Ā̰} b B ɓ Ɓ d D ɗ Ɗ e E é É ē Ē ḛ Ḛ {ḛ́} {Ḛ́} {ḛ̄} {Ḛ̄} ə Ə {ə́} {Ə́} {ə̄} {Ə̄} g G h H i I í Í ī Ī ḭ Ḭ {ḭ́} {Ḭ́} {ḭ̄} {Ḭ̄} ɨ Ɨ j J k K l L ĺ Ĺ m M ḿ Ḿ {m̄} {M̄} {mb} {MB} n N ń Ń {n̄} 
{N̄} {nd} {ND} {ng} {NG} {nj} {NJ} o O ó Ó ō Ō {o̰} {O̰} {ó̰} {Ó̰} {ō̰} {Ō̰} ɔ Ɔ {ɔ́} {Ɔ́} {ɔ̄} {Ɔ̄} p P r R ŕ Ŕ {r̄} {R̄} s S t T u U ú Ú ū Ū ṵ Ṵ {ṵ́} {Ṵ́} {ṵ̄} {Ṵ̄} v V w W {w̄} {W̄} y Y ý Ý ȳ Ȳ" + auxiliary: "c C f F q Q ŗ Ŗ x X z Z" + marks: "◌̄ ◌̧ ◌̰" +} +note: "ŗ was used in Palayer 1992; Gotengaye & Keegan 2016 uses it (represented by ᶉ, likely because of the comma-below instead of cedilla shape in some fonts) but indicates that young Sar speakers do not distinguish it from r. OSSEC 2015 does not use it. ABT uses accented letters, but not as many as the dictionaries, and doesn’t use ŗ." +source: "Alliance Biblique du Tchad, Bibəl ta Sar̄, Alliance Biblique du Tchad, 2006, 2010" +source: "Gotengaye Constant & John M. Keegan, Dictionnaire Sar, The Sara-Bagirmi Language Project, Cuenca: Morkeg Books, 2016" +source: "OSSEC, Ta ra dora̰ kə donang (La cosmogonie sar), Organisation Sara pour la Science, l’Éducation et la Culture (OSSEC), 2015" From 3da1de5dbda03d5eb1f20e90bf385ea7677eb24a Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Mon, 20 Nov 2023 13:48:07 +0100 Subject: [PATCH 11/14] Add mge_Latn --- Lib/gflanguages/data/languages/mge_Latn.textproto | 13 +++++++++++++ 1 file changed, 13 insertions(+) create mode 100644 Lib/gflanguages/data/languages/mge_Latn.textproto diff --git a/Lib/gflanguages/data/languages/mge_Latn.textproto b/Lib/gflanguages/data/languages/mge_Latn.textproto new file mode 100644 index 00000000..982ce7ef --- /dev/null +++ b/Lib/gflanguages/data/languages/mge_Latn.textproto @@ -0,0 +1,13 @@ +id: "mge_Latn" +language: "mge" +script: "Latn" +name: "Mango" +population: 77000 +region: "TD" +exemplar_chars { + base: "a A à À á Á {a̰} {A̰} {à̰} {À̰} {á̰} {Á̰} b B ɓ Ɓ d D ɗ Ɗ e E è È é É ḛ Ḛ {ḛ̀} {Ḛ̀} ə Ə {ə̀} {Ə̀} {ə́} {Ə́} {ə̰} {Ə̰} {ə̰̀} {Ə̰̀} {ə̰́} {Ə̰́} ɛ Ɛ {ɛ̰} {Ɛ̰} g G i I ḭ Ḭ {ḭ̀} {Ḭ̀} ɨ Ɨ {ɨ̀} {Ɨ̀} {ɨ́} {Ɨ́} j J k K l L m M n N {n̰} {N̰} o O ò Ò ó Ó ɔ Ɔ {ɔ̀} {Ɔ̀} {ɔ́} {Ɔ́} {ɔ̰} {Ɔ̰} {ɔ̰̀} {Ɔ̰̀} {ɔ̰́} {Ɔ̰́} p P r R s S t T u U ù Ù ú Ú ṵ Ṵ {ṵ́} {Ṵ́} w W 
y Y" + marks: "◌̀ ◌́ ◌̰" + auxiliary: "ā Ā {ā̰} {Ā̰} c C ē Ē {ḛ́} {Ḛ́} {ḛ̄} {Ḛ̄} {ə̄} {Ə̄} {ə̰̄} {Ə̰̄} f F h H ī Ī {ɨ̄} {Ɨ̄} ĺ Ĺ {l̄} {L̄} {m̄} {M̄} {n̄} {N̄} ō Ō {ó̰} {Ó̰} {ò̰} {Ò̰} {ō̰} {Ō̰} ū Ū {ṵ̀} {Ṵ̀} {ṵ̄} {Ṵ̄} ý Ý ỳ Ỳ ȳ Ȳ" +} +source: "Dodom Ndildongar Fidele, Loubeta Miclo, Gaston Altoloum & John M. Keegan, Lexique mango, Cuenca: Morkeg Books, The Sara Language Project, 2014" +source: "Ta lə Lubə Kunmindɨ kɨ́ Sigɨ, Wycliffe Bible Translators, 2018" From 9154238a9a6208d661d3a752a07077e9d6328db3 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Tue, 21 Nov 2023 14:02:09 +0100 Subject: [PATCH 12/14] Update rub_Latn --- Lib/gflanguages/data/languages/rub_Latn.textproto | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/Lib/gflanguages/data/languages/rub_Latn.textproto b/Lib/gflanguages/data/languages/rub_Latn.textproto index 3095631b..a1b1ffbd 100644 --- a/Lib/gflanguages/data/languages/rub_Latn.textproto +++ b/Lib/gflanguages/data/languages/rub_Latn.textproto @@ -5,6 +5,9 @@ name: "Gungu" population: 49000 region: "UG" exemplar_chars { - base: "a A b B c C d D e E f F g G h H i I j J k K l L m M n N o O p P r R s S t T u U v V w W y Y z Z ŋ Ŋ" + base: "a A b B {b̯} {B̯} c C d D e E f F g G h H i I {i̱} {I̱} j J k K l L m M n N {ngh} {NGH} {ny} {NY} o O p P r R s S t T u U {u̱} {U̱} v V w W y Y z Z" marks: "◌̯ ◌̱" -} \ No newline at end of file + auxiliary: "ŋ Ŋ" +} +source: "Lugungu Bible Translation and Literacy Association, Lugungu orthography guide, Entebbe: Lugungu Bible Translation and Literacy Association, SIL international, 2006" +source: "Lugungu Bible Translation and Literacy Association, “The Lugungu Alphabet”, Lugungu Dictionary, Webonary, 2016" From c8d16e5be2826c359ecf41bc141c7ad413357d53 Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Wed, 22 Nov 2023 01:22:29 +0100 Subject: [PATCH 13/14] Fix yre_Latn MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit "a A b B" and 
"ʼ ˮ ˗" missing from base Add sources --- Lib/gflanguages/data/languages/yre_Latn.textproto | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/Lib/gflanguages/data/languages/yre_Latn.textproto b/Lib/gflanguages/data/languages/yre_Latn.textproto index 54acfd24..16f9c0c7 100644 --- a/Lib/gflanguages/data/languages/yre_Latn.textproto +++ b/Lib/gflanguages/data/languages/yre_Latn.textproto @@ -5,6 +5,8 @@ name: "Yaouré" population: 40000 region: "CI" exemplar_chars { - base: "c C d D e E ɛ Ɛ f F g G i I ɩ Ɩ j J k K l L m M n N o O ɔ Ɔ p P r R s S t T u U ʋ Ʋ v V w W y Y z Z" + base: "a A b B c C d D e E ɛ Ɛ f F g G {gb} {GB} i I ɩ Ɩ j J k K {kp} {KP} l L m M n N o O ɔ Ɔ p P r R s S {sh} {SH} t T u U ʋ Ʋ v V w W y Y z Z ʼ ˮ ˗" auxiliary: "x X" -} \ No newline at end of file +} +source: "Guide pour lire et écrire le yaouré, SIL International, 1983" +source: "Rhonda L. Hartell, Alphabets of Africa, Dakar: BREDA (UNESCO) & Summer Institute of Linguistics, 1993" From 8ce00d3574a34f730f2f27515a4aabf1f875ad9c Mon Sep 17 00:00:00 2001 From: Denis Moyogo Jacquerye Date: Wed, 22 Nov 2023 01:36:21 +0100 Subject: [PATCH 14/14] Add mev_Latn --- Lib/gflanguages/data/languages/mev_Latn.textproto | 14 ++++++++++++++ 1 file changed, 14 insertions(+) create mode 100644 Lib/gflanguages/data/languages/mev_Latn.textproto diff --git a/Lib/gflanguages/data/languages/mev_Latn.textproto b/Lib/gflanguages/data/languages/mev_Latn.textproto new file mode 100644 index 00000000..f4becb88 --- /dev/null +++ b/Lib/gflanguages/data/languages/mev_Latn.textproto @@ -0,0 +1,14 @@ +id: "mev_Latn" +language: "mev" +script: "Latn" +name: "Mano" +population: 430000 +region: "GN" +region: "LR" +exemplar_chars { + base: "a A à À á Á ã à {ã̀} {Ã̀} {ã́} {Ã́} b B ɓ Ɓ d D e E è È é É ɛ Ɛ {ɛ̀} {Ɛ̀} {ɛ́} {Ɛ́} {ɛ̃̀} {Ɛ̃̀} {ɛ̃́} {Ɛ̃́} f F g G {gb} {GB} {gw} {GW} i I ì Ì í Í k K {kp} {KP} {kw} {KW} l L m M {m̀} {M̀} ḿ Ḿ n N ŋ Ŋ {ŋw} {ŊW} ɲ Ɲ o O ò Ò ó Ó ɔ Ɔ {ɔ̀} {Ɔ̀} {ɔ́} {Ɔ́} {ɔ̃} {Ɔ̃̀} {ɔ̃́} {Ɔ̃́} 
p P s S t T u U ù Ù ú Ú {ũ̀} {Ũ̀} ṹ Ṹ v V w W y Y z Z" + marks: "◌̀ ◌́ ◌̃ ◌̄ ◌̰" + auxiliary: "{a̰} {A̰} {à̰} {À̰} {á̰} {Á̰} {ā̰} {Ā̰} {ɛ̰} {Ɛ̰} {ɛ̰̀} {Ɛ̰̀} {ɛ̰́} {Ɛ̰́} {ɛ̰̄} {Ɛ̰̄} {ŋ̀} {Ŋ̀} {ŋ́} {Ŋ́} {ŋ̄} {Ŋ̄} {ɔ̰} {Ɔ̰} {ɔ̰̀} {Ɔ̰̀} {ɔ̰́} {Ɔ̰́} {ɔ̰̄} {Ɔ̰̄} ṵ Ṵ {ṵ̀} {Ṵ̀} {ṵ́} {Ṵ́} {ṵ̄} {Ṵ̄}" +} +note: "The LIBTRALO orthography uses ◌̀ ◌́ ◌̃ for low tone, high tone and nasal pronunciation whereas Khachaturyan, Carbo & Mamy 2022 uses ◌̀ ◌́ ◌̄ ◌̰ for low tone, high tone, mid tone and nasal pronunciation. Some authors use n after a nasal vowel instead of using tilde above (or below)." +source: "Maria Khachaturyan, Matilda Carbo & Pe Mamy, “Dictionnaire mano-français suivi d’un index français-mano”, Mandenkan, no. 67, 2022, p. 45-278"