diff --git a/SUPPORTED_LANGUAGES.md b/SUPPORTED_LANGUAGES.md index 0b91404..bf3c0a8 100644 --- a/SUPPORTED_LANGUAGES.md +++ b/SUPPORTED_LANGUAGES.md @@ -78,3 +78,4 @@ and [documentation](https://docs.rs/whatlang/). | Tagalog | tgl | `Lang::Tgl` | | Armenian | hye | `Lang::Hye` | | Welsh | cym | `Lang::Cym` | +| Venetian | vec | `Lang::Vec` | diff --git a/misc/data.json b/misc/data.json index 79dc3aa..ca047f6 100644 --- a/misc/data.json +++ b/misc/data.json @@ -121,7 +121,8 @@ "quy": "chi|nch|hik|una| ka|anc|kun|man|ana|aq |cha|aku|pas|as |sqa|paq|nan|qa |apa|kan|ikp|ik |ech|spa| de|pa |cho|ere|der|rec|am | ru|an | ma| ch|kpa|asq|ta |na |nam|nak|taq|a k|qan|ina|run|lli|ach|nap|pi |mi | ll|yoq|asp|ima|hay|hin|aqa|nku|ant|ayn|oyo| hi| im|hoy|cio|nta|nas|q k|api|iw |wan|kuy|kay|liw|aci|ion|ipa|lla|oq |npa|ay |kas|a m|nac| na|inc|all|ama|ari|anp| ya|chu| hu|nin|pip|i k|qmi|hon|w r|ata|awa|a c|ota|in |yku|yna| wa|a h|has|a d|iku|a l| li|pan|ich|may| pi| ha|onc|a r|onk| ot|ku | qa|ank|aqm|mun|anm|hu |a p|nma| mu|qta|n h|pap|isq|yni|ikm|ma |wsa|aws|kaw|ibr|bre|lib|ayk|usp|nqa|e k| al|lin|n k|re |ara|nat|yac|kma|war|huk|uwa|yta|hwa|chw| sa|was|kus|yan|m d|kpi|q m|a i|q l|kin|tap|a a|kta|ikt|i c|a s|uy | ca|qaw|uku| tu| re|aqt|ask|qsi|sak|uch|q h|cas|tin|pak|ris|ski|sic|q d|nmi|s l|naq|tuk|mpa|a y|k c|uma|ien|ypi| am|qaq|qap|eqs|ayp|req|qpa|aqp|law|ayt|q c|pun| ni|a q|ruw|i h|haw|n c| pa|amp|par|k h| le|yma|ñun|ern|huñ|nni|n r|anq|map|aya|tar|s m|uñu|ten|val|ura|ita|arm|isu|s c|onn|igu| ri|qku|naw|k l|u l|his|ley|say|s y|rim|aru|rma|sun|ier|s o|qar|n p|a f|a t|esq|n a|oqm|s i|awk| va|w n|hap|lap|kup|i r|kam|uyk|sap| qe|ual|m p|ran|nya|gua| pe| go|gob|maq|sum|ast| su| ig", "rmn": "aj |en | te|te | sa| le|aka|pen| si| e |el |ipe|si |kaj|sar| th|and| o |sav|qe |les| ma|es | ha|j t|hak|ja |ar |ave| an|a s|ta |i l|ia |nas| aj|ne | so|imn|mna|sqe|esq|nd |tha|haj|e s|e t|e a|enq|asq|man| ja|kan|e m| i | ta|the|mes|cia|bar|as |isa|utn|qo |hem|o s|s s| me|vel|ark|i t| na|kas|est| 
ba|s h|avo| di|ard| bi| pe|rka|lo | ak|ika|e r|a a| pr|e k|qi |mat|ima|e p|a t| av|e d|r s|n s|anu|nuś|o t|avi|orr|o a| ka| re|n a|re |aja|e o|sqo|sti| ov|õl |l p|nqe|ere|d o|vor|so |no |dik|rel|ove|n t|ve |e b|res|tim|ren| de|àci|o m|i a|but|len|ali|ari|rre|de | pa|ver| va|sqi|ara|ana|vip|rak|ang|vi | ra|or |ker|i s|eme|e z|ata|e l|a e|rip|rim|akh|la |o p|kar|e h|a p|na |ane|rin|ste|j b|er |ind|ni |tne| ph|nip|r t| ke|ti |are|ndo| je|l a|uśi|e n|khi| bu|kon|lim|al |tar|ekh|jek|àlo|o k| ko|rde|rab|aba| zi|ri |aća|ćar|śik|dõl|dor|on |ano|ven| ni|śaj| śa|khe|ća |ast|j s|uti|uni|tni|naś|i d|mut| po|i p|a m| pu|a l|l s|som|n n|ikh|nik|del|ala|ris|pes|pe |j m|enć|e e|nća|ndi|rdõ|kri|erd|śka|emu|men|alo|nis|aśt|śti|amu|kh |tis|uj |j p|do |ani|ate|nda|o b|nge|o z|soc|a d|muj|o j|da |pri|rdo| as|cie|l t|ro |i r|kla|ing|a j| ze|zen|j e|ziv|hin|aśk| st|maś|ran|pal|khl|mam|i b|oci|rea|l o|nqo| vi|n e", "lat": "is |et |us |um | et|ae |tat|ati| co|que|ue |ion| qu|em |ent|oni|est| su| iu| in| po|tio|tes|tis|ate|bus|e i|ita|ibu|ium|ius|qui|nti|eri|es |s p|con|s e|per|end|pot|ote| ha|nis| pr|s i|abe|uis|am |uae|tem|hab|bet|m h|ndi| ho|sta| de|sua|isq|squ|ter|ici|min|iur|one| re|hom| di| om|omn|rum|s a|t c|rat|lib|ibe|m e| pe|gen| li|ert|ine|nte|nem|ri |ber|tia|e q|dis| ip|ips| ad|di |nes|e s|e c|m p|s c| ve|e p| pa|ili| ge|a e|i p|nt |omi|atu|tur|rit| si|ne |psi|in |ia |ra |ari| cu|vit|rta|mo |to |mni|s h|e e|int|siu|m c|qua|t p|ivi|ini|ut |re |ers|it |s s|iae| es|t s|and| ne|pro| nu|st | ex|nda|cie|nib|t a|ere|tri|nit| at|tiu|ta |ris| ci|civ|ni |uri|ur |rim| vi|par|ad |ess|lic|i i| so| pu| op|rae| fa|s v| ut|dem|se |ons|o e|ria| se|e a| mo|leg|atq|tqu|com|te |niu|ien|vel|el | ma|t e|iis|gni|equ|oci|cip|ura|unt|s d|t i|ali|quo|ect| te|a s|t d| do|tut|ant|isc|ina|men|sin|ua |pra|oru|omm|eta|s n|a p|tum|iam|io |i c|sti| au|ver| ae|ito|dic|imi|s l|e d|fic|cia|t o|pub|ubl|bli|mun|i s|soc|aru|lar|ull|ori|t h|i e|sse|omo|cto|itu|tus| ea|ea |aeq|gio|ui |m s|er |m r| ra| fi|ffi|cog|da | 
le|mod|a c|mqu|nul|e o|era|ten|ntu|spe|o n|emo|cri|s f| ca|de |a d|rel|ii |ene| tu|sui|rti|sci|nae|m q|m a|egi|ces", - "cym": "yn | yn|dd | ma|ae |mae|au | y |d y|edd| r |ydd| ar| i |n y| o | cy|th | gw|ddi|eth|oed|ol |ar | gy| dd|wyd| ei| n | a |yd |odd| ga|aet|an | rh|iad|io |n a|ei |yr |wn |n c| ll| ca|n g|di |wed| me|od |el |n d|edi|r y|ith| we|ad | fe|er |r a|dau| da| am|d a|on |ch |l y|ddo| he| ch|roe| hy|e r| di|ynn| yr|dda|r g|gan|ir |ewn| ro|en | dy|fod| ff|iau|ll |mew| ym| de|id | sy|yw |dia|hyn|fyd|i g| un|eu |i d|nol|lla|u a|eit| ac|dol|i r|wy |dio|cyn|fel| ni|o r|idd|rth| go|l a|ai |efy|dyn| bo|rha|ed | dr|rwy|ada|n f|wyr|fer|ac |n e|rdd|aid|ael|all|nt |ion| tr|nyd|ach|gyf|cyf|r d|ig |h y|chw|ell|n b|d e|n o| by| ne|da | be|han|nia| oe|d o|r c|d g|dde|r o|af |ara|ni |n s| pe|lwy|gwe|i a|wr | br|in |gol| ge|rch|hef| ad|nod|nna|gyd| fa|un |d h| ys|d i|y d|e n|ria|es | an|dwy|am |ysg|y g|wyn|u c|l e|i f|gwy|efn|ddy|y c|dig|wys| eu|yda|n h|ych|thi|ant| yw|wei| ba|d c|n n|s y|yst|ryd|na |o a|i n|n m|u g|d d|law|i w|n i|n r| fo|ys |iae| co|do |lia|red|nd |y n|hau| ha|neu|u y|rhy|u r|bod| pr| ce|rae|gor|enn|gwa| pa|i c| er|lyn|rai|rif|ian|lli|nau|r h|lan|nwy|yfe|tha|r e|d m|diw|os |lle|ang| se|ddw|al |lad|o g|cae|ann|oli|a r|r b|rio|hyd|ait|aen|u d|no |d b| si|fan|lly|u h|o d|i b|dar|sgo|yng|dod|u n" + "cym": "yn | yn|dd | ma|ae |mae|au | y |d y|edd| r |ydd| ar| i |n y| o | cy|th | gw|ddi|eth|oed|ol |ar | gy| dd|wyd| ei| n | a |yd |odd| ga|aet|an | rh|iad|io |n a|ei |yr |wn |n c| ll| ca|n g|di |wed| me|od |el |n d|edi|r y|ith| we|ad | fe|er |r a|dau| da| am|d a|on |ch |l y|ddo| he| ch|roe| hy|e r| di|ynn| yr|dda|r g|gan|ir |ewn| ro|en | dy|fod| ff|iau|ll |mew| ym| de|id | sy|yw |dia|hyn|fyd|i g| un|eu |i d|nol|lla|u a|eit| ac|dol|i r|wy |dio|cyn|fel| ni|o r|idd|rth| go|l a|ai |efy|dyn| bo|rha|ed | dr|rwy|ada|n f|wyr|fer|ac |n e|rdd|aid|ael|all|nt |ion| tr|nyd|ach|gyf|cyf|r d|ig |h y|chw|ell|n b|d e|n o| by| ne|da | be|han|nia| oe|d o|r c|d g|dde|r o|af 
|ara|ni |n s| pe|lwy|gwe|i a|wr | br|in |gol| ge|rch|hef| ad|nod|nna|gyd| fa|un |d h| ys|d i|y d|e n|ria|es | an|dwy|am |ysg|y g|wyn|u c|l e|i f|gwy|efn|ddy|y c|dig|wys| eu|yda|n h|ych|thi|ant| yw|wei| ba|d c|n n|s y|yst|ryd|na |o a|i n|n m|u g|d d|law|i w|n i|n r| fo|ys |iae| co|do |lia|red|nd |y n|hau| ha|neu|u y|rhy|u r|bod| pr| ce|rae|gor|enn|gwa| pa|i c| er|lyn|rai|rif|ian|lli|nau|r h|lan|nwy|yfe|tha|r e|d m|diw|os |lle|ang| se|ddw|al |lad|o g|cae|ann|oli|a r|r b|rio|hyd|ait|aen|u d|no |d b| si|fan|lly|u h|o d|i b|dar|sgo|yng|dod|u n", + "vec": " de|el |de | el| co| in|ła |te | ła|to |a d| pa| la|xe |e d| i |e s|he |nte|a s| e | xe| se|par|e e|la |ent|o d|on |a c|che|e c|ar | st|ta | ch|ion|e l|in |e ł|na |a p|se | a |ra |e i| da|da |ti |ga |łe |e p| pr|e a| so|o e|a e| ga|sio| ca|al |con|re |int| ma|co |i d|so |ia |tà | al|sta|a i|ca |a a| an|del|o c|ant|a l| po| na| di|no |i i| un|l s|le | no| le|ni |nti|do |io |a ł|o i| te|i c|men|o s|asi|o a|est|i s|anc| łe|ri | re|i e|n d|era|ma |l p|tra|un |o p| do|sa |ist|a f|i p|e m|a m|l g|i a|e n|ro |ter| vi| ri|l c| si|sto|ari|e g|ne |and| mo|eri|me | fa| pi| fi|ghe|nca|ori|a n| ve|tan|nta|pre|l x|nto|com|ndo|e t|o l| tr|ont|are|ran|ani|art| su|a g|tor|ste|a x|ita|sti|e r|ver|a v|ome|pro|va |a r|res| me|e f|str|a t|ost|ica|po |à d|n c| fo|ort|ato| qu|ei |man| gr|e v|ian|ai |l d|n s| sc|pri|o ł| sa|sen|tro|ntr|en |ona|isi|an |per|ità|esi|ond|alt|n p| cu|ens|ora| ba| li| gh|cia|ame|ati|łi |col|e x|gra|for| pe|stà|i g|ło |n t|n a|can|ria|rte|e o|tri|opo|fin|ito|l m|nsi|go |ol |e u|rim|tar|mar|ałe|ałi| ze|ie |ina|ers|o m|dei|por| on|ura|ven|ara|ega|nsa|ze |n e|rio|cor|ico|di |er |i m| ta|mo |ien| va|egn|ten|dal|chi|ric|nda|ata| ci|qua|ser| tu|teg|rà |a o|o n|à a|si |tut|l t|ea |sia" }, "Cyrillic": { "rus": " пр| и |рав|ств| на|пра|го |ени|ове|во | ка|ани|ть | в | по| об|ия |сво| св|лов|на | че|ело|о н| со|ост|чел|ие |ого|ет |ния|ест|аво|ый |ажд| им|ние|век| не|льн|ли |ова|име|ать|при|т п|и п|каж|или|обо| ра|ых |жды| 
до|дый|воб|ек |бод|ва |й ч|его|ся |и с|ии |аци|еет|но |мее|и и|лен|ой |тва|ных|то | ил|к и|енн| бы|ию | за|ми |тво|и н|о п|ван|о с|сто|аль| вс|ом |о в|ьно|их |ног|и в|нов|ако|про|ий |сти|и о|пол|олж|дол|ое |бра|я в| ос|ным|жен|раз|ти |нос|я и| во|тор|все| ег|ей |тел|не |и р|ред|ель|тве|оди| ко|общ|о и| де|има|а и|чес|ним|сно|как| ли|щес|вле|ься|нны|аст|тьс|нно|осу|е д| от|пре|шен|а с|бще|осн|одн|быт|сов|ыть|лжн|ран|нию|иче|ак |ым |ват|что|сту|чен|е в| ст|рес|оль| ни|ном|род|ля |нар|вен|ду |оже|ны |е и| то|вер|а о|зов|м и|нац|ден|рин|туп|ежд|стр| чт|я п|она|дос|х и|й и|тоя|есп|лич|бес|обр|ото|о б|ьны|ь в|нии|е м|ую | мо|ем | ме|аро| ре|ава|кот|ав | вы|ам |жно|ста|ая |под|и к|ное| к | та| го|гос|суд|еоб|я н|ен |и д|мож|еск|ели|авн|ве |ече|уще|печ|дно|о д|ход|ка | дл|для|ово|ате|льс|ю и|в к|нен|ции|ной|уда|вов| бе|оро|нст|ами|циа|кон|сем|е о|вно| эт|азо|х п|ни |жде|м п|ког|от |дст|вны|сть|ые |о о|пос|сре|тра|ейс|так|и б|дов|му |я к|нал|дру| др|кой|тер|ь п|арс|изн|соц|еди|олн", @@ -164,4 +165,4 @@ "heb": "ות |ים |כל |ת ה| כל|דם |אדם|יות| של| זכ|ל א| אד|של |ל ה|אי |ויו|כאי|ת ו|י ל|זכא| ול|לא | וה|רות|זכו|ית |ירו|ין | או|ם ז| לא| הח|או | הא| וב| המ|חיר|ת ל|יים|ם ל|את |ת ב|ת ש|רה |ון | לה|נה |כוי|ותי|ה ש|ו ל|ו ב| הו|ת א|ם ב|ם ו|תו | את|לה |ני |אומ| במ|דה |א י|ה ה|ה ב|על |ם ה| על|הוא|וך |ה א|בוד|וד |ואי|נות|ה ו|ת כ|י ה|יה |ם ש|ו ו| שה|ם א|ו כ|ינו|ן ה| שו|שוו|החי|כות|לאו|בות|דות|ה ל|לית|ה מ| בי|וה |וא | הי| לפ|ור | לב|ל ב|בחי|הכר|לו |ת מ|ן ש|החו|ה כ| בכ|ומי|בין|ן ו|ן ל|רוי|פלי|ולה|ליה| הז|חינ| לע| בנ|יבו|חוק| אח|חבר| יה| חי|מי |ירה| חו|האד|ווה|חופ|ופש|וק |נו |יו |ל מ|מדי|כבו| הע|נוך| הד|י א|י ו| הכ|בני|עה |ו א|רצו|דינ|בזכ|מות|יפו| אל|סוד|לם |איש|רך | אי|הגנ|הם |פי |ם כ|חות|ל ו|איל|ילי|תיה|כלל|אלי|יסו|האו|זש | בא|ר א|ו ה|זו |אחר| הפ| בע| בז|משפ| בה| לח|דרך|ומו| בח| דר| מע|ל י|תוך|מנו| בש|לל |רבו| למ|פני| לק|תם |שה |שית|ללא|לפי|היה|מעש|דו |שות|להג|וצי|שוא|אין|וי |תי |ונו|ליל| לו|חיי|ל ז| זו|היא|יא |נתו|ה פ|לת |ובי| לכ|ך ה|יל |י ש|שיו|ן ב|עול|המד|ודה|ולם| ומ|א ה|ולא| בת|הכל| 
סו| מש| עב|סוצ|ארצ| אר|ציא|ד א|לחי|הן |יחס| יח|יאל|הזכ|ם נ| שר|בו |עבו|היס| לי|ת ז|פול|יהי|גבל|תיו|המא|שהי|א ל|מאו| יו|ותו|ישי|גנה|פשי|וחד|יהם|חרו|לכל|ידה|עות|ונה|ום |חה |עם |שרי|ם י|שר |והח| אש| הג|ק ב|הפל|נשו|הגב|ד ו", "yid": " פֿ|ון |ער |ן א| אַ|דער|ט א| או|און|אַר|ען |פֿו| אױ| אי|ן פ|ֿון|רעכ| דע| רע|עכט|פֿא|ן ד|כט | די|די |אַ |אױף|ױף |ֿאַ| זײ| גע|אַל|אָס| אָ|ונג| הא|האָ|זײַ| מע|אָל|נג |װאָ|ַן |אַנ|רײַ| װא|ָס |באַ| יע|יעד|ניט|ן ז|ר א|יט |אָט|אָר|עדע|מען|זאָ|ָט |פֿר|ײַן| בא|טן |אין|ן ג|ין |ן װ|נאַ|ֿרײ|ר ה| זא|לעכ|ע א|אָד|ַ ר|ענט|אַצ|ַצי|אָנ| צו| װע|יז |מענ|ָדע|איז|ן מ|ַלע|בן |ר מ|טער| מי| פּ|מיט|טלע|ָל |עכע|ײט |ַנד|ע פ|לע |געז|לאַ|אַפ|עזע|ראַ| ני|ַפֿ|רן |ײַנ|נען|טיק|כע |פֿע|יע |הײט|ַהײ|נטש|ײַה|ט ד|ן ב|לן |ן נ|פֿט|שאַ|רונ| זי| װי|ט פ| דא|טאָ|דיק|קן |ר פ|ר ג|יקן|אָב|ף א|אַק|קער|ערע|כער|י פ|ות |ַרב|פּר|קט |עם |יאָ|ציע|ציא|יט־|צו |ישע| קײ|ן ק|סער| גל|דאָ|ונט|גן |ַרא|יקע| טא|ענע|לײַ|שן |ַנע|יק |טאַ|ס א|עט |נגע|ט־א|ָנא|־אי|יקט|נטע|ײנע|־ני|ָר |װער|י א|ן י|יך |זיך|ער־|ערן|אױס|ָבן|נדע|ָסע|װי |ֿעל|ר־נ|ן ה| גר|גלײ| צי|ראָ|זעל|עלק|נד |לקע|אָפ| כּ|ט װ|ג א| נא|ט צ|ר ד|עס |דור|גען|קע |ג פ|ֿט |ן ל|שע |ר ז|רע |ײטן|פּע|קלא|קײט|יטע|ים |ס ז|ײַ | דו|אַט| לא|ר װ|קײנ|עלש|י ד|לשא|יות|נט |ַרז|ע ר|ל ז|אַמ|ן ש| שו|אינ|נטל| הי|בעט|ָפּ|ף פ|ײַכ|בער|ן צ|מאָ| שט| לע|גער|ורך|רך |נעם|גרו|פֿן|לער|װעל|ע מ|ום |שפּ|ך א|יונ|רבע|עפֿ|טעט|ן כ|רעס|ערצ|ז א|עמע|ם א|שטע|כן |רט |י ג|סן |נער|ליט|ט ז|נעמ|ּרא|היו|אַש|ת װ|אומ|ק א|יבע|ֿן |ץ א|פֿי|ײן |ם ט" } -} +} \ No newline at end of file diff --git a/misc/supported_languages.csv b/misc/supported_languages.csv index 6a6027e..1d226b0 100644 --- a/misc/supported_languages.csv +++ b/misc/supported_languages.csv @@ -69,3 +69,4 @@ cat,Catalan,Català,10 tgl,Tagalog,Tagalog, hye,Armenian,Հայերեն,7 cym,Welsh,Cymraeg,0.5 +vec,Venetian,Vèneto,3 diff --git a/src/alphabets/latin.rs b/src/alphabets/latin.rs index 5439647..c55c719 100644 --- a/src/alphabets/latin.rs +++ b/src/alphabets/latin.rs @@ -43,6 +43,7 @@ const UZB: &str = "abcdefghijklmnopqrstuvxyzʻ"; const VIE: &str 
= "abcdefghijklmnopqrstuvwxyzàáâãèéêìíòóôõùúýăđĩũơưạảấầẩẫậắằẳẵặẹẻẽếềểễệỉịọỏốồổỗộớờởỡợụủứừửữựỳỵỷỹ"; const ZUL: &str = "abcdefghijklmnopqrstuvwxyz"; +const VEC: &str = "abcdefghilmnopqrstuvxyzàèéòóùł"; const LATIN_ALPHABETS: &[(Lang, &str)] = &[ (Lang::Afr, AFR), @@ -81,6 +82,7 @@ const LATIN_ALPHABETS: &[(Lang, &str)] = &[ (Lang::Tur, TUR), (Lang::Uzb, UZB), (Lang::Vie, VIE), + (Lang::Vec, VEC), (Lang::Zul, ZUL), ]; @@ -165,8 +167,8 @@ mod tests { let outcome = alphabet_calculate_scores(&text, &filter); assert_eq!(outcome.count, 50); - assert_eq!(outcome.raw_scores.len(), 37); - assert_eq!(outcome.scores.len(), 37); + assert_eq!(outcome.raw_scores.len(), 38); + assert_eq!(outcome.scores.len(), 38); let raw_scores_for = |lang: Lang| { outcome diff --git a/src/core/detect.rs b/src/core/detect.rs index 8fb4218..70b2c0c 100644 --- a/src/core/detect.rs +++ b/src/core/detect.rs @@ -141,6 +141,7 @@ mod tests { Lang::Swe, Lang::Nob, Lang::Tgl, + Lang::Vec, Lang::Cym, ]); let options = Options::new().set_filter_list(filter_list); diff --git a/src/lang.rs b/src/lang.rs index b91f15b..44a36c2 100644 --- a/src/lang.rs +++ b/src/lang.rs @@ -226,9 +226,12 @@ pub enum Lang { /// Cymraeg (Welsh) Cym = 69, + + /// Vèneto (Venetian) + Vec = 70, } -const VALUES: [Lang; 70] = [ +const VALUES: [Lang; 71] = [ Lang::Epo, Lang::Eng, Lang::Rus, @@ -299,6 +302,7 @@ const VALUES: [Lang; 70] = [ Lang::Tgl, Lang::Hye, Lang::Cym, + Lang::Vec, ]; fn lang_from_code<S: Into<String>>(code: S) -> Option<Lang> { @@ -373,6 +377,7 @@ fn lang_from_code<S: Into<String>>(code: S) -> Option<Lang> { "tgl" => Some(Lang::Tgl), "hye" => Some(Lang::Hye), "cym" => Some(Lang::Cym), + "vec" => Some(Lang::Vec), _ => None, } } @@ -449,6 +454,7 @@ fn lang_to_code(lang: Lang) -> &'static str { Lang::Tgl => "tgl", Lang::Hye => "hye", Lang::Cym => "cym", + Lang::Vec => "vec", } } @@ -524,6 +530,7 @@ fn lang_to_name(lang: Lang) -> &'static str { Lang::Tgl => "Tagalog", Lang::Hye => "Հայերեն", Lang::Cym => "Cymraeg", + Lang::Vec => "Vèneto", } } @@ -599,6 +606,7 @@ 
fn lang_to_eng_name(lang: Lang) -> &'static str { Lang::Tgl => "Tagalog", Lang::Hye => "Armenian", Lang::Cym => "Welsh", + Lang::Vec => "Venetian", } } @@ -708,7 +716,7 @@ mod tests { #[test] fn test_all() { - assert_eq!(Lang::all().len(), 70); + assert_eq!(Lang::all().len(), 71); let all = Lang::all(); assert!(all.contains(&Lang::Ukr)); assert!(all.contains(&Lang::Swe)); diff --git a/src/scripts/lang_mapping.rs b/src/scripts/lang_mapping.rs index 5ebb148..b63b588 100644 --- a/src/scripts/lang_mapping.rs +++ b/src/scripts/lang_mapping.rs @@ -1,7 +1,7 @@ use super::Script; use crate::Lang; -const LATIN_LANGS: [Lang; 37] = [ +const LATIN_LANGS: [Lang; 38] = [ Lang::Spa, Lang::Eng, Lang::Por, @@ -39,6 +39,7 @@ const LATIN_LANGS: [Lang; 37] = [ Lang::Est, Lang::Lat, Lang::Cym, + Lang::Vec, ]; const CYRILLIC_LANGS: [Lang; 6] = [ Lang::Rus, diff --git a/src/trigrams/profiles.rs b/src/trigrams/profiles.rs index 1049de2..cba17b4 100644 --- a/src/trigrams/profiles.rs +++ b/src/trigrams/profiles.rs @@ -11294,6 +11294,311 @@ pub static LATIN_LANGS: LangProfileList = &[ Trigram('u', ' ', 'n'), ], ), + ( + Lang::Vec, + &[ + Trigram(' ', 'd', 'e'), + Trigram('e', 'l', ' '), + Trigram('d', 'e', ' '), + Trigram(' ', 'e', 'l'), + Trigram(' ', 'c', 'o'), + Trigram(' ', 'i', 'n'), + Trigram('ł', 'a', ' '), + Trigram('t', 'e', ' '), + Trigram(' ', 'ł', 'a'), + Trigram('t', 'o', ' '), + Trigram('a', ' ', 'd'), + Trigram(' ', 'p', 'a'), + Trigram(' ', 'l', 'a'), + Trigram('x', 'e', ' '), + Trigram('e', ' ', 'd'), + Trigram(' ', 'i', ' '), + Trigram('e', ' ', 's'), + Trigram('h', 'e', ' '), + Trigram('n', 't', 'e'), + Trigram('a', ' ', 's'), + Trigram(' ', 'e', ' '), + Trigram(' ', 'x', 'e'), + Trigram(' ', 's', 'e'), + Trigram('p', 'a', 'r'), + Trigram('e', ' ', 'e'), + Trigram('l', 'a', ' '), + Trigram('e', 'n', 't'), + Trigram('o', ' ', 'd'), + Trigram('o', 'n', ' '), + Trigram('a', ' ', 'c'), + Trigram('c', 'h', 'e'), + Trigram('e', ' ', 'c'), + Trigram('a', 'r', ' '), + Trigram(' ', 
's', 't'), + Trigram('t', 'a', ' '), + Trigram(' ', 'c', 'h'), + Trigram('i', 'o', 'n'), + Trigram('e', ' ', 'l'), + Trigram('i', 'n', ' '), + Trigram('e', ' ', 'ł'), + Trigram('n', 'a', ' '), + Trigram('a', ' ', 'p'), + Trigram('s', 'e', ' '), + Trigram(' ', 'a', ' '), + Trigram('r', 'a', ' '), + Trigram('e', ' ', 'i'), + Trigram(' ', 'd', 'a'), + Trigram('d', 'a', ' '), + Trigram('t', 'i', ' '), + Trigram('g', 'a', ' '), + Trigram('ł', 'e', ' '), + Trigram('e', ' ', 'p'), + Trigram(' ', 'p', 'r'), + Trigram('e', ' ', 'a'), + Trigram(' ', 's', 'o'), + Trigram('o', ' ', 'e'), + Trigram('a', ' ', 'e'), + Trigram(' ', 'g', 'a'), + Trigram('s', 'i', 'o'), + Trigram(' ', 'c', 'a'), + Trigram('a', 'l', ' '), + Trigram('c', 'o', 'n'), + Trigram('r', 'e', ' '), + Trigram('i', 'n', 't'), + Trigram(' ', 'm', 'a'), + Trigram('c', 'o', ' '), + Trigram('i', ' ', 'd'), + Trigram('s', 'o', ' '), + Trigram('i', 'a', ' '), + Trigram('t', 'à', ' '), + Trigram(' ', 'a', 'l'), + Trigram('s', 't', 'a'), + Trigram('a', ' ', 'i'), + Trigram('c', 'a', ' '), + Trigram('a', ' ', 'a'), + Trigram(' ', 'a', 'n'), + Trigram('d', 'e', 'l'), + Trigram('o', ' ', 'c'), + Trigram('a', 'n', 't'), + Trigram('a', ' ', 'l'), + Trigram(' ', 'p', 'o'), + Trigram(' ', 'n', 'a'), + Trigram(' ', 'd', 'i'), + Trigram('n', 'o', ' '), + Trigram('i', ' ', 'i'), + Trigram(' ', 'u', 'n'), + Trigram('l', ' ', 's'), + Trigram('l', 'e', ' '), + Trigram(' ', 'n', 'o'), + Trigram(' ', 'l', 'e'), + Trigram('n', 'i', ' '), + Trigram('n', 't', 'i'), + Trigram('d', 'o', ' '), + Trigram('i', 'o', ' '), + Trigram('a', ' ', 'ł'), + Trigram('o', ' ', 'i'), + Trigram(' ', 't', 'e'), + Trigram('i', ' ', 'c'), + Trigram('m', 'e', 'n'), + Trigram('o', ' ', 's'), + Trigram('a', 's', 'i'), + Trigram('o', ' ', 'a'), + Trigram('e', 's', 't'), + Trigram('i', ' ', 's'), + Trigram('a', 'n', 'c'), + Trigram(' ', 'ł', 'e'), + Trigram('r', 'i', ' '), + Trigram(' ', 'r', 'e'), + Trigram('i', ' ', 'e'), + Trigram('n', ' ', 'd'), + 
Trigram('e', 'r', 'a'), + Trigram('m', 'a', ' '), + Trigram('l', ' ', 'p'), + Trigram('t', 'r', 'a'), + Trigram('u', 'n', ' '), + Trigram('o', ' ', 'p'), + Trigram(' ', 'd', 'o'), + Trigram('s', 'a', ' '), + Trigram('i', 's', 't'), + Trigram('a', ' ', 'f'), + Trigram('i', ' ', 'p'), + Trigram('e', ' ', 'm'), + Trigram('a', ' ', 'm'), + Trigram('l', ' ', 'g'), + Trigram('i', ' ', 'a'), + Trigram('e', ' ', 'n'), + Trigram('r', 'o', ' '), + Trigram('t', 'e', 'r'), + Trigram(' ', 'v', 'i'), + Trigram(' ', 'r', 'i'), + Trigram('l', ' ', 'c'), + Trigram(' ', 's', 'i'), + Trigram('s', 't', 'o'), + Trigram('a', 'r', 'i'), + Trigram('e', ' ', 'g'), + Trigram('n', 'e', ' '), + Trigram('a', 'n', 'd'), + Trigram(' ', 'm', 'o'), + Trigram('e', 'r', 'i'), + Trigram('m', 'e', ' '), + Trigram(' ', 'f', 'a'), + Trigram(' ', 'p', 'i'), + Trigram(' ', 'f', 'i'), + Trigram('g', 'h', 'e'), + Trigram('n', 'c', 'a'), + Trigram('o', 'r', 'i'), + Trigram('a', ' ', 'n'), + Trigram(' ', 'v', 'e'), + Trigram('t', 'a', 'n'), + Trigram('n', 't', 'a'), + Trigram('p', 'r', 'e'), + Trigram('l', ' ', 'x'), + Trigram('n', 't', 'o'), + Trigram('c', 'o', 'm'), + Trigram('n', 'd', 'o'), + Trigram('e', ' ', 't'), + Trigram('o', ' ', 'l'), + Trigram(' ', 't', 'r'), + Trigram('o', 'n', 't'), + Trigram('a', 'r', 'e'), + Trigram('r', 'a', 'n'), + Trigram('a', 'n', 'i'), + Trigram('a', 'r', 't'), + Trigram(' ', 's', 'u'), + Trigram('a', ' ', 'g'), + Trigram('t', 'o', 'r'), + Trigram('s', 't', 'e'), + Trigram('a', ' ', 'x'), + Trigram('i', 't', 'a'), + Trigram('s', 't', 'i'), + Trigram('e', ' ', 'r'), + Trigram('v', 'e', 'r'), + Trigram('a', ' ', 'v'), + Trigram('o', 'm', 'e'), + Trigram('p', 'r', 'o'), + Trigram('v', 'a', ' '), + Trigram('a', ' ', 'r'), + Trigram('r', 'e', 's'), + Trigram(' ', 'm', 'e'), + Trigram('e', ' ', 'f'), + Trigram('s', 't', 'r'), + Trigram('a', ' ', 't'), + Trigram('o', 's', 't'), + Trigram('i', 'c', 'a'), + Trigram('p', 'o', ' '), + Trigram('à', ' ', 'd'), + Trigram('n', ' ', 'c'), 
+ Trigram(' ', 'f', 'o'), + Trigram('o', 'r', 't'), + Trigram('a', 't', 'o'), + Trigram(' ', 'q', 'u'), + Trigram('e', 'i', ' '), + Trigram('m', 'a', 'n'), + Trigram(' ', 'g', 'r'), + Trigram('e', ' ', 'v'), + Trigram('i', 'a', 'n'), + Trigram('a', 'i', ' '), + Trigram('l', ' ', 'd'), + Trigram('n', ' ', 's'), + Trigram(' ', 's', 'c'), + Trigram('p', 'r', 'i'), + Trigram('o', ' ', 'ł'), + Trigram(' ', 's', 'a'), + Trigram('s', 'e', 'n'), + Trigram('t', 'r', 'o'), + Trigram('n', 't', 'r'), + Trigram('e', 'n', ' '), + Trigram('o', 'n', 'a'), + Trigram('i', 's', 'i'), + Trigram('a', 'n', ' '), + Trigram('p', 'e', 'r'), + Trigram('i', 't', 'à'), + Trigram('e', 's', 'i'), + Trigram('o', 'n', 'd'), + Trigram('a', 'l', 't'), + Trigram('n', ' ', 'p'), + Trigram(' ', 'c', 'u'), + Trigram('e', 'n', 's'), + Trigram('o', 'r', 'a'), + Trigram(' ', 'b', 'a'), + Trigram(' ', 'l', 'i'), + Trigram(' ', 'g', 'h'), + Trigram('c', 'i', 'a'), + Trigram('a', 'm', 'e'), + Trigram('a', 't', 'i'), + Trigram('ł', 'i', ' '), + Trigram('c', 'o', 'l'), + Trigram('e', ' ', 'x'), + Trigram('g', 'r', 'a'), + Trigram('f', 'o', 'r'), + Trigram(' ', 'p', 'e'), + Trigram('s', 't', 'à'), + Trigram('i', ' ', 'g'), + Trigram('ł', 'o', ' '), + Trigram('n', ' ', 't'), + Trigram('n', ' ', 'a'), + Trigram('c', 'a', 'n'), + Trigram('r', 'i', 'a'), + Trigram('r', 't', 'e'), + Trigram('e', ' ', 'o'), + Trigram('t', 'r', 'i'), + Trigram('o', 'p', 'o'), + Trigram('f', 'i', 'n'), + Trigram('i', 't', 'o'), + Trigram('l', ' ', 'm'), + Trigram('n', 's', 'i'), + Trigram('g', 'o', ' '), + Trigram('o', 'l', ' '), + Trigram('e', ' ', 'u'), + Trigram('r', 'i', 'm'), + Trigram('t', 'a', 'r'), + Trigram('m', 'a', 'r'), + Trigram('a', 'ł', 'e'), + Trigram('a', 'ł', 'i'), + Trigram(' ', 'z', 'e'), + Trigram('i', 'e', ' '), + Trigram('i', 'n', 'a'), + Trigram('e', 'r', 's'), + Trigram('o', ' ', 'm'), + Trigram('d', 'e', 'i'), + Trigram('p', 'o', 'r'), + Trigram(' ', 'o', 'n'), + Trigram('u', 'r', 'a'), + Trigram('v', 'e', 
'n'), + Trigram('a', 'r', 'a'), + Trigram('e', 'g', 'a'), + Trigram('n', 's', 'a'), + Trigram('z', 'e', ' '), + Trigram('n', ' ', 'e'), + Trigram('r', 'i', 'o'), + Trigram('c', 'o', 'r'), + Trigram('i', 'c', 'o'), + Trigram('d', 'i', ' '), + Trigram('e', 'r', ' '), + Trigram('i', ' ', 'm'), + Trigram(' ', 't', 'a'), + Trigram('m', 'o', ' '), + Trigram('i', 'e', 'n'), + Trigram(' ', 'v', 'a'), + Trigram('e', 'g', 'n'), + Trigram('t', 'e', 'n'), + Trigram('d', 'a', 'l'), + Trigram('c', 'h', 'i'), + Trigram('r', 'i', 'c'), + Trigram('n', 'd', 'a'), + Trigram('a', 't', 'a'), + Trigram(' ', 'c', 'i'), + Trigram('q', 'u', 'a'), + Trigram('s', 'e', 'r'), + Trigram(' ', 't', 'u'), + Trigram('t', 'e', 'g'), + Trigram('r', 'à', ' '), + Trigram('a', ' ', 'o'), + Trigram('o', ' ', 'n'), + Trigram('à', ' ', 'a'), + Trigram('s', 'i', ' '), + Trigram('t', 'u', 't'), + Trigram('l', ' ', 't'), + Trigram('e', 'a', ' '), + Trigram('s', 'i', 'a'), + ], + ), ]; /// Languages for script Cyrillic diff --git a/tests/examples.json b/tests/examples.json index 47c7892..9ab9ffb 100644 --- a/tests/examples.json +++ b/tests/examples.json @@ -67,5 +67,6 @@ "slk": "Kodifikačné príručky určujú, ktoré slová sa v slovenčine považujú za spisovné. Ide o 4 zákonom predpísané knihy.", "cat": "Aquest és l’honor més gran que he rebut a la meva vida. La pau ha estat sempre la meva més gran preocupació. Ja en la meva infantesa vaig aprendre a estimar-la. La meva mare – una dona excepcional, genial - , quan jo era noi, ja em parlava de la pau, perquè en aquells temps també hi havia moltes guerres. A més, sóc català. Catalunya va tenir el primer Parlament democràtic molt abans que Anglaterra. I fou al meu país on hi hagué les primeres nacions unides. En aquell temps – segle onzè – van reunir-se a Toluges – avui França – per parlar de la pau, perquè els catalans d’aquell temps ja estaven contra, CONTRA la guerra. 
Per això les Nacions Unides, que treballen únicament per l’ideal de la pau, estan en el meu cor, perquè tot allò referent a la pau hi va directament. (...) Fa molts anys que no toco el violoncel en públic, però crec que he de fer-ho en aquesta ocasió. Vaig a tocar una melodia del folklore català: El cant dels ocells. Els ocells, quan són al cel, van cantant: 'Peace, Peace, Peace' (pau, pau, pau) i és una melodia que Bach, Beethoven i tots els grans haurien admirat i estimat. I, a més, neix de l’ànima del meu poble, Catalunya.", "tgl": "Sapagkat ang pagkilala sa katutubong karangalan at sa pantay at di-maikakait na mga karapatan ng lahat ng nabibilang sa angkan ng tao ay siyang saligan ng kalayaan, katarungan at kapayapaan sa daigdig. Sapagkat ang pagwawalang-bahala at paglalapastangan sa mga karapatan ng tao ay nagbunga ng mga gawang di-makatao na humamak sa budhi ng sangkatauhan, at ang pagdatal ng isang daigdig na ang mga tao ay magtatamasa ng kalayaan sa pagsasalita at ng kaligtasan sa pangamba at pagdaralita ay ipinahayag na pinakamataas na mithiin ng mga karaniwang tao. Sapagkat mahalaga, kung ang tao ay di-pipiliting manghawakan bilang huling magagawa, sa paghihimagsik laban sa paniniil at pang-aapi, na ang mga karapatan ng tao'y mapangalagaan sa pamamagitan ng paghahari ng batas. Sapagkat mahalagang itaguyod ang pagpapaunlad ng mabuting pagsasamahan ng mga bansa. Sapagkat ang mga mamamayan ng Mga Bansang Nagkakaisa ay nagpatibay sa Karta ng kanilang pananalig sa mga Saligang karapatan ng tao, sa karangalan at ", - "hye": "Հայոց լեզվով ստեղծվել" -} + "hye": "Հայոց լեզվով ստեղծվել", + "vec": "El vèneto el xe na łéngua romansa parlada da çirca kuatro miłioni de persone, soratuto in tel Vèneto, ma anca in tel Friùłi-Venèsia Jùłia, in tel Trentino-Alto Àdexe e in tełe comunidà venete sparse par el mondo." +} \ No newline at end of file