Commit
* temporal fixings attempt to fixn SH test errors, will fix back Signed-off-by: Alex Cui <[email protected]> * temporal changes will change back Signed-off-by: Alex Cui <[email protected]> * update jp tn date Signed-off-by: Alex Cui <[email protected]> * resolving conflict Signed-off-by: Alex Cui <[email protected]> * adding grammars back in the tokenizer Signed-off-by: Alex Cui <[email protected]> * fixing ci test cases Signed-off-by: Alex Cui <[email protected]> * updats on Jenkins Signed-off-by: Alex Cui <[email protected]> * with pynini closure had errors chaing back to no closure version Signed-off-by: Alex Cui <[email protected]> * jenkinspdate Signed-off-by: Alex Cui <[email protected]> * changing the data format, to align to the blind test data Signed-off-by: Alex Cui <[email protected]> * adding one more test item Signed-off-by: Alex Cui <[email protected]> * temporal fixings attempt to fixn SH test errors, will fix back Signed-off-by: Alex Cui <[email protected]> * adding grammars back in the tokenizer Signed-off-by: Alex Cui <[email protected]> * fixing ci test cases resolving conflicts Signed-off-by: Alex Cui <[email protected]> * with pynini closure had errors chaing back to no closure version Signed-off-by: Alex Cui <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <[email protected]> * resolving fraction space issue Signed-off-by: Alex Cui <[email protected]> * resolving space issue on fraction, added NEMO_NARROW_NON_BREAK_SPACE Signed-off-by: Alex Cui <[email protected]> * resolving space fraction issue added NEMO_NARROW_NON_BREAK_SPACE and NEMO_SPACES_AND_ALHPANUMERICS Signed-off-by: Alex Cui <[email protected]> * fixed typo on decimaltext Signed-off-by: Alex Cui <[email protected]> * removing unsed grammar Signed-off-by: Alex Cui <[email protected]> * removing unsed grammar Signed-off-by: Alex Cui <[email protected]> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci * removing unsed improts Signed-off-by: Alex Cui <[email protected]> * removing unused import Signed-off-by: Alex Cui <[email protected]> * changed regular space to narrow space Signed-off-by: Alex Cui <[email protected]> * imports error fixing Signed-off-by: Alex Cui <[email protected]> * imports errors Signed-off-by: Alex Cui <[email protected]> * Jekins update for jp itn Signed-off-by: Alex Cui <[email protected]> * update for fraction space issue Signed-off-by: Alex Cui <[email protected]> * update for fraction space issue Signed-off-by: Alex Cui <[email protected]> * update for fraction space issue Signed-off-by: Alex Cui <[email protected]> * reverting Signed-off-by: Alex Cui <[email protected]> * update for fraction space issuel chaing narrow space to regular normal space Signed-off-by: Alex Cui <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixing style Signed-off-by: Alex Cui <[email protected]> * fixng style Signed-off-by: Alex Cui <[email protected]> * style fix Signed-off-by: Alex Cui <[email protected]> * style fix Signed-off-by: Alex Cui <[email protected]> * style fix Signed-off-by: Alex Cui <[email protected]> * removing unsed imports Signed-off-by: Alex Cui <[email protected]> * jp tn date update Signed-off-by: Alex Cui <[email protected]> * Update test_cases_fraction.txt Signed-off-by: Buyuan(Alex) Cui <[email protected]> * removing previously created nemo imports Signed-off-by: Alex Cui <[email protected]> * space issue Signed-off-by: Alex Cui <[email protected]> * test order arrangement Signed-off-by: Alex Cui <[email protected]> * resolve fraction space issue Signed-off-by: Alex Cui <[email protected]> * style fix Signed-off-by: Alex Cui <[email protected]> * fix style Signed-off-by: Alex Cui <[email protected]> * space issue Signed-off-by: Alex Cui <[email protected]> * update jp tn Signed-off-by: 
Alex Cui <[email protected]> * removing unsed import Signed-off-by: Alex Cui <[email protected]> * Update post_processing.py Signed-off-by: Buyuan(Alex) Cui <[email protected]> * empty file Signed-off-by: Alex Cui <[email protected]> * to delete Signed-off-by: Alex Cui <[email protected]> * removing Signed-off-by: Alex Cui <[email protected]> * add contributing (#21) * add contributing Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * add Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Alex Cui <[email protected]> * add jenkins file (#23) Signed-off-by: ekmb <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Swedish TN (#12) * test now runs, but getting ordinal instead of cardinal Signed-off-by: Jim O'Regan <[email protected]> * force ordinals to either have :a/:e or "." at the end Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <[email protected]> * add minimal ordinal data Signed-off-by: Jim O'Regan <[email protected]> * test runner for ordinals, adapted from es Signed-off-by: Jim O'Regan <[email protected]> * fix test case Signed-off-by: Jim O'Regan <[email protected]> * add // to symbols Signed-off-by: Jim O'Regan <[email protected]> * add test cases for electronic; transformed with sed from spanish, so I expect errors Signed-off-by: Jim O'Regan <[email protected]> * test runner for electronic, adapted from es Signed-off-by: Jim O'Regan <[email protected]> * fixes to make electronic verbaliser work (not yet) Signed-off-by: Jim O'Regan <[email protected]> * fixes to make electronic verbaliser work (not yet) Signed-off-by: Jim O'Regan <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move to graph_utils Signed-off-by: Jim O'Regan <[email protected]> * add very minimal test case for fractions Signed-off-by: Jim O'Regan <[email protected]> * test runner for fraction, adapted from es Signed-off-by: Jim O'Regan <[email protected]> * fix language Signed-off-by: Jim O'Regan <[email protected]> * fix graph construction to make pluralisation work Signed-off-by: Jim O'Regan <[email protected]> * test runner for decimal, adapted from es Signed-off-by: Jim O'Regan <[email protected]> * test runner for whitelist, adapted from es Signed-off-by: Jim O'Regan <[email protected]> * add very minimal test case for whitelist Signed-off-by: Jim O'Regan <[email protected]> * test runner for word, adapted from es Signed-off-by: Jim O'Regan <[email protected]> * test runner for date, adapted from es Signed-off-by: Jim O'Regan <[email protected]> * test runner for measure, adapted from es Signed-off-by: Jim O'Regan <[email protected]> * fix a pair of test cases Signed-off-by: Jim O'Regan <[email protected]> * fix plurals Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix number, but this whole thing is only partially adapted Signed-off-by: Jim O'Regan <[email protected]> * fix some test cases Signed-off-by: Jim O'Regan <[email protected]> * add usd$ Signed-off-by: Jim O'Regan <[email protected]> * insert "komma" Signed-off-by: Jim O'Regan <[email protected]> * "pund" is neuter Signed-off-by: Jim O'Regan <[email protected]> * fix test cases Signed-off-by: Jim O'Regan <[email protected]> * towards proper graphs Signed-off-by: Jim O'Regan <[email protected]> * GBP Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix a test case Signed-off-by: Jim O'Regan <[email 
protected]> * make komma non-det Signed-off-by: Jim O'Regan <[email protected]> * more money tagger fixes Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add more minor words Signed-off-by: Jim O'Regan <[email protected]> * do a bit better with en/ett Signed-off-by: Jim O'Regan <[email protected]> * fix another test case Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * use the correct list Signed-off-by: Jim O'Regan <[email protected]> * fix more test cases Signed-off-by: Jim O'Regan <[email protected]> * fix more test cases Signed-off-by: Jim O'Regan <[email protected]> * make sure the numbers have no 1 Signed-off-by: Jim O'Regan <[email protected]> * abbreviations for million and milliard Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add year suffixes Signed-off-by: Jim O'Regan <[email protected]> * add minimal tests Signed-off-by: Jim O'Regan <[email protected]> * expansions of era abbreviations Signed-off-by: Jim O'Regan <[email protected]> * use eras Signed-off-by: Jim O'Regan <[email protected]> * use eras in verbaliser Signed-off-by: Jim O'Regan <[email protected]> * fix examples in comment Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix extension Signed-off-by: Jim O'Regan <[email protected]> * fix separator Signed-off-by: Jim O'Regan <[email protected]> * date verbaliser is broken, this does not fix it Signed-off-by: Jim O'Regan <[email protected]> * load labels Signed-off-by: Jim O'Regan <[email protected]> * right first time Signed-off-by: Jim O'Regan <[email protected]> * missing space Signed-off-by: Jim O'Regan <[email protected]> * fix 
year in test cases Signed-off-by: Jim O'Regan <[email protected]> * getting closer to getting dates working Signed-off-by: Jim O'Regan <[email protected]> * add a (failing) test case Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * date working now Signed-off-by: Jim O'Regan <[email protected]> * also handle decades Signed-off-by: Jim O'Regan <[email protected]> * remove todo Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * years where -00 is -hundra Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * test runner for telephone (adapted from es) Signed-off-by: Jim O'Regan <[email protected]> * changes to telephone tagger/verbaliser Signed-off-by: Jim O'Regan <[email protected]> * add partially incomplete test data Signed-off-by: Jim O'Regan <[email protected]> * mostly fixed test cases Signed-off-by: Jim O'Regan <[email protected]> * more in progress changes to telephone parts Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * much prodding later, turns out I forgot a space Signed-off-by: Jim O'Regan <[email protected]> * missed wrapping Signed-off-by: Jim O'Regan <[email protected]> * no difference Signed-off-by: Jim O'Regan <[email protected]> * Revert "no difference" This reverts commit 29680925bebd65d489f3b1a5415607c12bb7e3b9. 
Signed-off-by: Jim O'Regan <[email protected]> * telephone tagger works Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * try adding brackets Signed-off-by: Jim O'Regan <[email protected]> * try adding more brackets Signed-off-by: Jim O'Regan <[email protected]> * fix another test case Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add a comment, because I confused myself Signed-off-by: Jim O'Regan <[email protected]> * move abbreviations Signed-off-by: Jim O'Regan <[email protected]> * add in abbreviations Signed-off-by: Jim O'Regan <[email protected]> * add a version for fraction that does what I intended: 2 & 3 digit numbers without leading 0 are read as cardinals, everything else as digits Signed-off-by: Jim O'Regan <[email protected]> * single digit Signed-off-by: Jim O'Regan <[email protected]> * more test cases Signed-off-by: Jim O'Regan <[email protected]> * add another test case/remove a duplicate Signed-off-by: Jim O'Regan <[email protected]> * use the nice variable I just added to cardinal Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * this is not right; leading zeros fail Signed-off-by: Jim O'Regan <[email protected]> * Revert "this is not right; leading zeros fail" This reverts commit 5997e95e0cb08ffee9cf21a9c82697ed7beb042f. 
Signed-off-by: Jim O'Regan <[email protected]> * ok, this seems to work Signed-off-by: Jim O'Regan <[email protected]> * drop the tests starting with comma Signed-off-by: Jim O'Regan <[email protected]> * decimal tagger works Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add more test cases Signed-off-by: Jim O'Regan <[email protected]> * lower case Signed-off-by: Jim O'Regan <[email protected]> * add klockan and variants as a prompt, so they are not silently deleted Signed-off-by: Jim O'Regan <[email protected]> * add a very minimal test case for time Signed-off-by: Jim O'Regan <[email protected]> * rewrite with less ambiguity Signed-off-by: Jim O'Regan <[email protected]> * rewrite with less ambiguity, hms Signed-off-by: Jim O'Regan <[email protected]> * add prompt Signed-off-by: Jim O'Regan <[email protected]> * copy the roman handling from es Signed-off-by: Jim O'Regan <[email protected]> * greek letters Signed-off-by: Jim O'Regan <[email protected]> * some fixes to the time tagger Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * test runner for time (adapted from es) Signed-off-by: Jim O'Regan <[email protected]> * test runner for time (adapted from es) ((actually adapted)) Signed-off-by: Jim O'Regan <[email protected]> * more work on time Signed-off-by: Jim O'Regan <[email protected]> * |=, not = Signed-off-by: Jim O'Regan <[email protected]> * adapt verbaliser a little Signed-off-by: Jim O'Regan <[email protected]> * add some test cases from module comments Signed-off-by: Jim O'Regan <[email protected]> * export some variables to check Signed-off-by: Jim O'Regan <[email protected]> * small fix Signed-off-by: Jim O'Regan <[email protected]> * comment some stuff that needs major changes Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto 
fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove Signed-off-by: Jim O'Regan <[email protected]> * try doing this here Signed-off-by: Jim O'Regan <[email protected]> * Revert "try doing this here" This reverts commit ebdba0e3da5cdde19eae24f268ee0edd21f298b5. Signed-off-by: Jim O'Regan <[email protected]> * fix errors in tests Signed-off-by: Jim O'Regan <[email protected]> * minimal test cases for measure Signed-off-by: Jim O'Regan <[email protected]> * sort|uniq everything, see what the difference is Signed-off-by: Jim O'Regan <[email protected]> * merge different tsvs Signed-off-by: Jim O'Regan <[email protected]> * fix casing to avoid conflicts Signed-off-by: Jim O'Regan <[email protected]> * export some variables for testing Signed-off-by: Jim O'Regan <[email protected]> * add a test case Signed-off-by: Jim O'Regan <[email protected]> * need an en/ett split here too Signed-off-by: Jim O'Regan <[email protected]> * fix decimal subgraph Signed-off-by: Jim O'Regan <[email protected]> * remove todo, I've just done it Signed-off-by: Jim O'Regan <[email protected]> * remove missing integer test, does not work elsewhere Signed-off-by: Jim O'Regan <[email protected]> * remove unused imports Signed-off-by: Jim O'Regan <[email protected]> * include greek letters in maths Signed-off-by: Jim O'Regan <[email protected]> * include greek here too Signed-off-by: Jim O'Regan <[email protected]> * minor sg/pl Signed-off-by: Jim O'Regan <[email protected]> * dedup Signed-off-by: Jim O'Regan <[email protected]> * fix a test case Signed-off-by: Jim O'Regan <[email protected]> * put these under if, too Signed-off-by: Jim O'Regan <[email protected]> * no; there are no minor neuters, so that is not relevant here Signed-off-by: Jim O'Regan <[email protected]> * remove greek from here, interferes with delimeter Signed-off-by: Jim O'Regan <[email protected]> * export variables to see what is happening Signed-off-by: Jim O'Regan <[email protected]> * 
fix some test cases Signed-off-by: Jim O'Regan <[email protected]> * here is one error Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * put ensure_space in graph_utils Signed-off-by: Jim O'Regan <[email protected]> * handle cases where unit follows amount Signed-off-by: Jim O'Regan <[email protected]> * export a variable Signed-off-by: Jim O'Regan <[email protected]> * add a tesst case Signed-off-by: Jim O'Regan <[email protected]> * remove unused imports Signed-off-by: Jim O'Regan <[email protected]> * . is not a cardinal separator Signed-off-by: Jim O'Regan <[email protected]> * fix case Signed-off-by: Jim O'Regan <[email protected]> * add yen Signed-off-by: Jim O'Regan <[email protected]> * final fixes Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix typo Signed-off-by: Jim O'Regan <[email protected]> * fix typo Signed-off-by: Jim O'Regan <[email protected]> * remove English roman tagger Signed-off-by: Jim O'Regan <[email protected]> * add tokenize_and_classify_lm.py (adapted from en) Signed-off-by: Jim O'Regan <[email protected]> * remove some unused pieces Signed-off-by: Jim O'Regan <[email protected]> * add tokenize_and_classify_with_audio.py (adapted from en) Signed-off-by: Jim O'Regan <[email protected]> * add test pieces for audio (recopied from es) Signed-off-by: Jim O'Regan <[email protected]> * add audio test (adapted from es) Signed-off-by: Jim O'Regan <[email protected]> * in non-deterministic mode, generate both en and ett Signed-off-by: Jim O'Regan <[email protected]> * add very minimal non-deterministic test Signed-off-by: Jim O'Regan <[email protected]> * remove commented pieces/things that will not be used Signed-off-by: Jim O'Regan <[email protected]> * warnings about missing whitelist Signed-off-by: Jim O'Regan <[email protected]> * add 
sv Signed-off-by: Jim O'Regan <[email protected]> * remove commented pieces/things that will not be used Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * some Riksdag specific titles Signed-off-by: Jim O'Regan <[email protected]> * add my copyright to the other files with non-trivial changes Signed-off-by: Jim O'Regan <[email protected]> * fix year Signed-off-by: Jim O'Regan <[email protected]> * add Swedish support in pynini_export Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add Swedish support for sparrowhark tests -- untested (: Signed-off-by: Jim O'Regan <[email protected]> * address codeql comments Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change decade to year; sparrowhawk enforces categories Signed-off-by: Jim O'Regan <[email protected]> * shoehorn this stuff into the overly narrow sparrowhawk classes Signed-off-by: Jim O'Regan <[email protected]> * Revert "shoehorn this stuff into the overly narrow sparrowhawk classes" This reverts commit a3cf3d5de1702366b2bf9c12ebf7e5d26634c688. 
Signed-off-by: Jim O'Regan <[email protected]> * read out the AM/PM words, they are not read as letters anyway Signed-off-by: Jim O'Regan <[email protected]> * change date verbaliser to manage isolated decades Signed-off-by: Jim O'Regan <[email protected]> * redo changes to get rid of 'prompt' for 'klockan' Signed-off-by: Jim O'Regan <[email protected]> * remove broken duplicate Signed-off-by: Jim O'Regan <[email protected]> * add a test case Signed-off-by: Jim O'Regan <[email protected]> * add a case for hours without minutes (which should not happen) Signed-off-by: Jim O'Regan <[email protected]> * time tests now pass Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add a time test case that also passes here, but not in sparrowhawk Signed-off-by: Jim O'Regan <[email protected]> * fix error in dates, add more tests Signed-off-by: Jim O'Regan <[email protected]> * import delete_preserve_order Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * codeql feedback Signed-off-by: Jim O'Regan <[email protected]> * add some ambiguous abbreviations for non-deterministic mode (more as a demonstration than anything deeply useful) Signed-off-by: Jim O'Regan <[email protected]> * move to the correct subdirectory Signed-off-by: Jim O'Regan <[email protected]> * add swedish Signed-off-by: Jim O'Regan <[email protected]> * remove pynini checks Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix error with 1000 in non-deterministic cases Signed-off-by: Jim O'Regan <[email protected]> * fix here also Signed-off-by: Jim O'Regan <[email protected]> * also generate a string of digits if not deterministic Signed-off-by: Jim O'Regan <[email protected]> * add a date case Signed-off-by: 
Jim O'Regan <[email protected]> * remove duplication Signed-off-by: Jim O'Regan <[email protected]> * boost n_tagged Signed-off-by: Jim O'Regan <[email protected]> * also copyright this year Signed-off-by: Jim O'Regan <[email protected]> * 1500 only fixes one, boost again Signed-off-by: Jim O'Regan <[email protected]> * 2500 does nothing, going to -1 Signed-off-by: Jim O'Regan <[email protected]> * remove audio normalisation; I do not have the time to get this working right now, and there was a bunch more to do for it anyway Signed-off-by: Jim O'Regan <[email protected]> * Revert "remove audio normalisation; I do not have the time to get this working right now, and there was a bunch more to do for it anyway" This reverts commit 383a096083061b0c79457e815a65e55563c7ac74. Signed-off-by: Jim O'Regan <[email protected]> * try setting a low weight to everything non-default Signed-off-by: Jim O'Regan <[email protected]> * put n_tagged back to 500 Signed-off-by: Jim O'Regan <[email protected]> * remove unused import Signed-off-by: Jim O'Regan <[email protected]> * days of the week Signed-off-by: Jim O'Regan <[email protected]> * add more abbreviations Signed-off-by: Jim O'Regan <[email protected]> * setting weights did not work for cardinals, but it did push the test from taking 11 minutes to something more than 40. 
Re-reverting Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix typo Signed-off-by: Jim O'Regan <[email protected]> * fix typo Signed-off-by: Jim O'Regan <[email protected]> * remove blank line Signed-off-by: Jim O'Regan <[email protected]> * forgot to remove this piece in the merge conflict Signed-off-by: Jim O'Regan <[email protected]> * remove erroneously added copyright notice Signed-off-by: Jim O'Regan <[email protected]> * change year of copyright in empty files, they aren't eligible anyway Signed-off-by: Jim O'Regan <[email protected]> * add __init__.py in a few places it was missing Signed-off-by: Jim O'Regan <[email protected]> * add the google notice required by the incoming contributing document Signed-off-by: Jim O’Regan <[email protected]> --------- Signed-off-by: Jim O'Regan <[email protected]> Signed-off-by: Jim O’Regan <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * CI setup (#25) * restarting ci Signed-off-by: ekmb <[email protected]> * restarting ci _cr Signed-off-by: ekmb <[email protected]> * revert setup tool Signed-off-by: ekmb <[email protected]> * remove pytest-runner from setup.py Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * fix jenkins Signed-off-by: ekmb <[email protected]> * update test dir Signed-off-by: ekmb <[email protected]> * update test dir Signed-off-by: ekmb <[email protected]> * update test dir Signed-off-by: ekmb <[email protected]> * update test dir Signed-off-by: ekmb <[email protected]> * update test dir Signed-off-by: ekmb <[email protected]> * update test dir Signed-off-by: ekmb <[email protected]> * update test dir Signed-off-by: ekmb <[email protected]> * update test dir Signed-off-by: ekmb <[email 
protected]> * update test dir Signed-off-by: ekmb <[email protected]> --------- Signed-off-by: ekmb <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Merge EN riva release 22.10 (#26) * Merge EN riva release 22.10 Signed-off-by: Anand Joseph <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Code cleanup Signed-off-by: Anand Joseph <[email protected]> --------- Signed-off-by: Anand Joseph <[email protected]> Signed-off-by: anand-nv <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * Eng TN - update urls to handle dictionary words (#27) * wip el words Signed-off-by: ekmb <[email protected]> * wip el words Signed-off-by: ekmb <[email protected]> * wip Signed-off-by: ekmb <[email protected]> * electronic pass Signed-off-by: ekmb <[email protected]> * test pass Signed-off-by: ekmb <[email protected]> * clean up Signed-off-by: ekmb <[email protected]> * clean up Signed-off-by: ekmb <[email protected]> * remove unused imports Signed-off-by: ekmb <[email protected]> * add deterministic option normalized options Signed-off-by: ekmb <[email protected]> * update jenkins grammar folder Signed-off-by: ekmb <[email protected]> * clean up, update for SH Signed-off-by: ekmb <[email protected]> * update jenkins dir Signed-off-by: ekmb <[email protected]> * clean up Signed-off-by: ekmb <[email protected]> * reduce cardinal graph Signed-off-by: ekmb <[email protected]> * jenkins dir Signed-off-by: ekmb <[email protected]> * add weight for sh Signed-off-by: ekmb <[email protected]> --------- Signed-off-by: ekmb <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Tn en astronomical no (#28) * Add support for large numbers (>999,999,999,999,999) Signed-off-by: Anand Joseph <[email protected]> * Update cache folder in Jenkinsfile Signed-off-by: Anand Joseph <[email protected]> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Increase mem size for CI tests Signed-off-by: Anand Joseph <[email protected]> * Updating shmem for docker to deal with memory overflow Signed-off-by: Anand Joseph <[email protected]> * Ensure large au cardinal graph is used only if deterministic Signed-off-by: Anand Joseph <[email protected]> * Make comma mandatory in cardinals Signed-off-by: Anand Joseph <[email protected]> * Run FST cache generation and Pytests in separate stages Signed-off-by: Anand Joseph <[email protected]> * Fix stage Signed-off-by: Anand Joseph <[email protected]> * Change cache folder Signed-off-by: Anand Joseph <[email protected]> --------- Signed-off-by: Anand Joseph <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * Add whitelist param to ITN (#30) * add whitelist param to itn Signed-off-by: ekmb <[email protected]> * add whitelist to export Signed-off-by: ekmb <[email protected]> * update docstrings Signed-off-by: ekmb <[email protected]> --------- Signed-off-by: ekmb <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Eng tn itn (#31) * Add additional units and plurals Signed-off-by: Anand Joseph <[email protected]> * Add support for financial periods (1H22, 2Q19) Signed-off-by: Anand Joseph <[email protected]> * Add missing plural for "gigabit per second" Signed-off-by: Anand Joseph <[email protected]> * Fix for measures Signed-off-by: Anand Joseph <[email protected]> * Use environment variables to set path of fst cache Signed-off-by: Anand Joseph <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix environment variable Signed-off-by: Anand Joseph <[email protected]> --------- Signed-off-by: Anand Joseph <[email protected]> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * Fix parse "None" as string (#33) * Fix parse "None" as string Signed-off-by: Anand Joseph <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Anand Joseph <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * read double digits for telephone grammar (#32) * read double digits for telephone grammar Signed-off-by: Larisa Kempbell <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * import zero graph instead of hard coding Signed-off-by: Larisa Kempbell <[email protected]> --------- Signed-off-by: Larisa Kempbell <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * Install (#35) * remove conda pynini install Signed-off-by: Yang Zhang <[email protected]> * added pynini install note Signed-off-by: Yang Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix text Signed-off-by: Yang Zhang <[email protected]> --------- Signed-off-by: Yang Zhang <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * Install (#36) * remove conda pynini install Signed-off-by: Yang Zhang <[email protected]> * added pynini install note Signed-off-by: Yang Zhang <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix text Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> --------- Signed-off-by: Yang Zhang <[email protected]> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * 0.1.6rc0 (#37) Signed-off-by: ekmb <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Add ci (#39) * Add additional languages to CI Pipeline Signed-off-by: Anand Joseph <[email protected]> * Fix Jenkinsfile Signed-off-by: Anand Joseph <[email protected]> * Add missing 'ar' in lang options Signed-off-by: Anand Joseph <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix missing 'ar' in normalize.py Signed-off-by: Anand Joseph <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Correct name of verbalizer far Signed-off-by: Anand Joseph <[email protected]> * Run language tests in stages Signed-off-by: Anand Joseph <[email protected]> * Update DE cache folder Signed-off-by: Anand Joseph <[email protected]> * Add VI, RU, SV CI tests Signed-off-by: Anand Joseph <[email protected]> * Fix misssing bracket, add ZH Signed-off-by: Anand Joseph <[email protected]> * Use non-deterministic TN for RU Signed-off-by: Anand Joseph <[email protected]> --------- Signed-off-by: Anand Joseph <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * support the use of phonetic superscript letters for ordinals, because there are maniacs on the internet who think that because you can, you should (#41) Signed-off-by: Jim O'Regan <[email protected]> Signed-off-by: Alex Cui <[email protected]> * update fr cache path for ci (#44) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Alex Cui <[email protected]> * update ITN to work after Punctuation capitalization model (#22) * add cases with capitalization, cardinal, decimal pass Signed-off-by: ekmb <[email protected]> * fix 
telephone, ordinal Signed-off-by: ekmb <[email protected]> * restarting ci Signed-off-by: ekmb <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restarting ci Signed-off-by: ekmb <[email protected]> * restarting ci Signed-off-by: ekmb <[email protected]> * restarting ci Signed-off-by: ekmb <[email protected]> * update electronic Signed-off-by: ekmb <[email protected]> * review feedback, update whitelist Signed-off-by: ekmb <[email protected]> * rename capitalize func Signed-off-by: ekmb <[email protected]> * fix SH tests Signed-off-by: ekmb <[email protected]> * fix tests Signed-off-by: ekmb <[email protected]> * update jenkins folder name Signed-off-by: ekmb <[email protected]> * added cased arg to ITN Signed-off-by: ekmb <[email protected]> * add input_case arg to other lang Signed-off-by: ekmb <[email protected]> * jenkins dirs update Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * update test Signed-off-by: ekmb <[email protected]> * fix codeql errors Signed-off-by: ekmb <[email protected]> * fix sh Signed-off-by: ekmb <[email protected]> * review Signed-off-by: ekmb <[email protected]> * update jenkins dir Signed-off-by: ekmb <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix default value Signed-off-by: ekmb <[email protected]> --------- Signed-off-by: ekmb <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * En names (#42) * Add support for Financial year and for years between 1000 BC and 1000AD Signed-off-by: Anand Joseph <[email protected]> * Add support for product names and add abbreviations to whitelist Signed-off-by: Anand Joseph <[email protected]> * Add weights for some sequences, exclude 'a' before numeric sequence Signed-off-by: Anand Joseph <[email 
protected]> * Add tests Signed-off-by: Anand Joseph <[email protected]> * Update cache folder for EN Signed-off-by: Anand Joseph <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update FR Cache path Signed-off-by: Anand Joseph <[email protected]> * Move text to TSV files, and some code cleanup Signed-off-by: Anand Joseph <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add additional vocabulary, allow singular usage of units to support adjective phrases Signed-off-by: Anand Joseph <[email protected]> * Fix issue with whitelist loader not handling weights correctly Move cased loader file to graph_utils Signed-off-by: Anand Joseph <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * insert space between value and unit Signed-off-by: Anand Joseph <[email protected]> * Insert space between measurement and unit. 
Adjust weight for ordinal Signed-off-by: Anand Joseph <[email protected]> * Update tests Signed-off-by: Anand Joseph <[email protected]> --------- Signed-off-by: Anand Joseph <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * update doc and fix alignment for itn (#47) * save Signed-off-by: Yang Zhang <[email protected]> * save Signed-off-by: Yang Zhang <[email protected]> * extend alignment for itn Signed-off-by: Yang Zhang <[email protected]> --------- Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Align ci test (#51) * added jenkins tests for aligment Signed-off-by: Yang Zhang <[email protected]> * added test to pr doc Signed-off-by: Yang Zhang <[email protected]> * fix ci test Signed-off-by: Yang Zhang <[email protected]> * fix ci test Signed-off-by: Yang Zhang <[email protected]> * fix ci test Signed-off-by: Yang Zhang <[email protected]> * fix ci Signed-off-by: Yang Zhang <[email protected]> * fix ci Signed-off-by: Yang Zhang <[email protected]> * fix ci Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> * fix Signed-off-by: Yang Zhang <[email protected]> --------- Signed-off-by: Yang Zhang <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Audio-based TN for Swedish (#49) * Audio-based TN for Swedish, for Språkbanken Tal Replaces #48 Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Updating cache directory (Not entirely sure what the pattern is) Signed-off-by: Jim O’Regan <[email protected]> * Delete tokenize_and_classify_lm.py Signed-off-by: Jim O’Regan <[email protected]> * fraction fix from ITN branch Signed-off-by: Jim O'Regan <[email protected]> --------- Signed-off-by: Jim O'Regan <[email protected]> Signed-off-by: Jim O’Regan <[email 
protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * fix sv tests (#52) Signed-off-by: ekmb <[email protected]> Signed-off-by: Alex Cui <[email protected]> * 0.1.7 release (#53) Signed-off-by: ekmb <[email protected]> Signed-off-by: Alex Cui <[email protected]> * En names (#56) * Rename "period" tag to "text" tag for date to avoid changes to sparrowhawk proto Signed-off-by: Anand Joseph <[email protected]> * Update Jenkinsfile Signed-off-by: anand-nv <[email protected]> --------- Signed-off-by: Anand Joseph <[email protected]> Signed-off-by: anand-nv <[email protected]> Signed-off-by: Alex Cui <[email protected]> * fix bug for hh:mm:ss normalization (#57) Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Mariana <[email protected]> Signed-off-by: Alex Cui <[email protected]> * rewrite regex to silence deprecation warning (#55) Signed-off-by: Jim O'Regan <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Hungarian TN ✅ (#9) * additional exports from cardinal Signed-off-by: Jim O'Regan <[email protected]> * add inflection for quantities Signed-off-by: Jim O'Regan <[email protected]> * add a test case Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * enable decimal Signed-off-by: Jim O'Regan <[email protected]> * change integer Signed-off-by: Jim O'Regan <[email protected]> * fixes to verbaliser for decimal Signed-off-by: Jim O'Regan <[email protected]> * more test cases Signed-off-by: Jim O'Regan <[email protected]> * add superessive forms (powers of) Signed-off-by: Jim O'Regan <[email protected]> * superscript to superessive Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add vowels Signed-off-by: Jim O'Regan <[email protected]> 
* add vowels Signed-off-by: Jim O'Regan <[email protected]> * fix var Signed-off-by: Jim O'Regan <[email protected]> * bare minimum electronic test Signed-off-by: Jim O'Regan <[email protected]> * add another test case Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add a symbol Signed-off-by: Jim O'Regan <[email protected]> * add incomplete time tagger (partially adapted from de) Signed-off-by: Jim O'Regan <[email protected]> * fix error with some inflected abbreviations Signed-off-by: Jim O'Regan <[email protected]> * add some alternative measure forms Signed-off-by: Jim O'Regan <[email protected]> * hour, minute, second; whichever is last can be inflected Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add test runner for time Signed-off-by: Jim O'Regan <[email protected]> * add very minimal time test Signed-off-by: Jim O'Regan <[email protected]> * will want cardinal here Signed-off-by: Jim O'Regan <[email protected]> * add inflection for things like GBP, where inflection is based on pé Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * docstring Signed-off-by: Jim O'Regan <[email protected]> * move two letters Signed-off-by: Jim O'Regan <[email protected]> * add my copyright Signed-off-by: Jim O'Regan <[email protected]> * partially adapted number tagger (adapted from de) Signed-off-by: Jim O'Regan <[email protected]> * small changes Signed-off-by: Jim O'Regan <[email protected]> * add unadapted measure tagger (from de) Signed-off-by: Jim O'Regan <[email protected]> * other ways of reading w Signed-off-by: Jim O'Regan <[email protected]> * for non deterministic, a bunch of these symbols can be read as letters Signed-off-by: Jim O'Regan <[email protected]> * 
currency Signed-off-by: Jim O'Regan <[email protected]> * more inflection Signed-off-by: Jim O'Regan <[email protected]> * get the abbreviation expanded as letters for non-deterministic Signed-off-by: Jim O'Regan <[email protected]> * working now, add a comment Signed-off-by: Jim O'Regan <[email protected]> * also integer, and preserve order Signed-off-by: Jim O'Regan <[email protected]> * also accept the full words Signed-off-by: Jim O'Regan <[email protected]> * deduplicate Signed-off-by: Jim O'Regan <[email protected]> * reorder to make a bit more sense Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * explicitly make tuples elsewhere; this works from what I see of the function output, but not in the resulting fst Signed-off-by: Jim O'Regan <[email protected]> * adapt comments Signed-off-by: Jim O'Regan <[email protected]> * commenting out weighted part makes this work Signed-off-by: Jim O'Regan <[email protected]> * duplicate space Signed-off-by: Jim O'Regan <[email protected]> * partially adapted money verbaliser Signed-off-by: Jim O'Regan <[email protected]> * actually saving the adaptations Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add time_zone data (copy from de) Signed-off-by: Jim O'Regan <[email protected]> * delete commented code, irrelevant here Signed-off-by: Jim O'Regan <[email protected]> * small modifications, still thinking about how to tackle this Signed-off-by: Jim O'Regan <[email protected]> * add missing __init__.py Signed-off-by: Jim O'Regan <[email protected]> * change year of copyright in empty files, they aren't eligible anyway Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix missing tabs Signed-off-by: Jim O'Regan <[email 
protected]> * remove pynini checks from tests Signed-off-by: Jim O'Regan <[email protected]> * remove unused import Signed-off-by: Jim O'Regan <[email protected]> * fix typo Signed-off-by: Jim O'Regan <[email protected]> * uncomment everything. yolo. Signed-off-by: Jim O'Regan <[email protected]> * add verbaliser for measure (unadapted from de) Signed-off-by: Jim O'Regan <[email protected]> * add verbaliser for telephone (unadapted from de) Signed-off-by: Jim O'Regan <[email protected]> * add verbaliser for time (unadapted from de) Signed-off-by: Jim O'Regan <[email protected]> * uncomment everything. yolo. Signed-off-by: Jim O'Regan <[email protected]> * fix cache dir Signed-off-by: Jim O'Regan <[email protected]> * tagger for telephone (copy from sv) Signed-off-by: Jim O'Regan <[email protected]> * add basic tests (native verified) Signed-off-by: Jim O'Regan <[email protected]> * add components for read digits Signed-off-by: Jim O'Regan <[email protected]> * add an example with a different separator Signed-off-by: Jim O'Regan <[email protected]> * start adapting Signed-off-by: Jim O'Regan <[email protected]> * add 2-digit area codes Signed-off-by: Jim O'Regan <[email protected]> * add another Signed-off-by: Jim O'Regan <[email protected]> * add Bp to area codes, no need to be that specific Signed-off-by: Jim O'Regan <[email protected]> * export var Signed-off-by: Jim O'Regan <[email protected]> * in progress Signed-off-by: Jim O'Regan <[email protected]> * country codes Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * copy/paste errors abound Signed-off-by: Jim O'Regan <[email protected]> * put in a function rather than duplicate Signed-off-by: Jim O'Regan <[email protected]> * nominal digits Signed-off-by: Jim O'Regan <[email protected]> * add IP prompt Signed-off-by: Jim O'Regan <[email protected]> * add google copyright notice; probably meaningless 
Signed-off-by: Jim O'Regan <[email protected]> * more work on telephone Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused import Signed-off-by: Jim O'Regan <[email protected]> * fix path Signed-off-by: Jim O'Regan <[email protected]> * minor adaptation; more needed Signed-off-by: Jim O'Regan <[email protected]> * replace time verbaliser with version from sv Signed-off-by: Jim O'Regan <[email protected]> * adapt more Signed-off-by: Jim O'Regan <[email protected]> * nearly there Signed-off-by: Jim O'Regan <[email protected]> * replace with version from sv Signed-off-by: Jim O'Regan <[email protected]> * extend tests Signed-off-by: Jim O'Regan <[email protected]> * some tweaks Signed-off-by: Jim O'Regan <[email protected]> * add an IP test Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add a couple more ordinal tests Signed-off-by: Jim O'Regan <[email protected]> * move variables Signed-off-by: Jim O'Regan <[email protected]> * filter ordinals Signed-off-by: Jim O'Regan <[email protected]> * basic fraction tests Signed-off-by: Jim O'Regan <[email protected]> * . 
and / both clash, so only make year optional if it is not deterministic Signed-off-by: Jim O'Regan <[email protected]> * using the other word for two, that test cannot pass Signed-off-by: Jim O'Regan <[email protected]> * numerator and denominator can compound; qdd minus Signed-off-by: Jim O'Regan <[email protected]> * form fractionals in ordinal, because something about bare_ordinals does not work when exported Signed-off-by: Jim O'Regan <[email protected]> * add another test, including spaces Signed-off-by: Jim O'Regan <[email protected]> * works in the repl, not in reality Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * copy fraction symbols from es Signed-off-by: Jim O'Regan <[email protected]> * copy two lines from es to handle faction symbols Signed-off-by: Jim O'Regan <[email protected]> * add a test for that Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * extend Signed-off-by: Jim O'Regan <[email protected]> * ah, I was forgetting to delete preserve order Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add pieces from swedish itn, adapted Signed-off-by: Jim O'Regan <[email protected]> * add a function to give from/to minutes for 15/30/45 subdivision Signed-off-by: Jim O'Regan <[email protected]> * add functions, but some pieces came from ITN, so are backwards Signed-off-by: Jim O'Regan <[email protected]> * ok, should change the quarter word to a cardinal, or something Signed-off-by: Jim O'Regan <[email protected]> * swapping order Signed-off-by: Jim O'Regan <[email protected]> * more swapping Signed-off-by: Jim O'Regan <[email protected]> * remove 
import Signed-off-by: Jim O'Regan <[email protected]> * add an example Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change some things Signed-off-by: Jim O'Regan <[email protected]> * some things fixed Signed-off-by: Jim O'Regan <[email protected]> * more adjustments to time Signed-off-by: Jim O'Regan <[email protected]> * more todo, but working for this subset Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more time Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * missing endings Signed-off-by: Jim O'Regan <[email protected]> * sort|uniq Signed-off-by: Jim O'Regan <[email protected]> * timezone can be inflected too Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add sparrowhark test (todo) Signed-off-by: Jim O'Regan <[email protected]> * add test_cases_word (copy from sv) Signed-off-by: Jim O'Regan <[email protected]> * add some word cases with Hungarian accents Signed-off-by: Jim O'Regan <[email protected]> * add Hungarian to Jenkinsfile. 
This may cause much distress and wailing and gnashing of teeth Signed-off-by: Jim O'Regan <[email protected]> * fix the commented ITN part Signed-off-by: Jim O'Regan <[email protected]> * add hu Signed-off-by: Jim O'Regan <[email protected]> * basic test cases for the last two parts Signed-off-by: Jim O'Regan <[email protected]> * fix measure cardinals Signed-off-by: Jim O'Regan <[email protected]> * a couple more tests, last still not working Signed-off-by: Jim O'Regan <[email protected]> * missed removing preserver_order Signed-off-by: Jim O'Regan <[email protected]> * fix test Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused imports Signed-off-by: Jim O'Regan <[email protected]> * codeql Signed-off-by: Jim O'Regan <[email protected]> * codeql Signed-off-by: Jim O'Regan <[email protected]> * comment the variables I may wish to use later (codeql) Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix decimals Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * incorporate feedback from @Laszlo-Weber Signed-off-by: Jim O'Regan <[email protected]> * bare minimum tests + fix verbaliser Signed-off-by: Jim O'Regan <[email protected]> * add öre (also for NOK) Signed-off-by: Jim O’Regan <[email protected]> * Comment line, for now Signed-off-by: Jim O’Regan <[email protected]> * try breaking this into pieces Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing __init__.py Signed-off-by: Jim O'Regan <[email protected]> * add missing __init__.py Signed-off-by: Jim O'Regan <[email protected]> * revert 0c6823e111a876495702d347cf7b347106388ed4 Signed-off-by: Jim 
O'Regan <[email protected]> * fix a bug in cardinal graph Signed-off-by: Jim O'Regan <[email protected]> * at no point is 000 being deleted; probably why the tests are weird Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert a0d031a861fcd7b5750027f2887f3344f39b6616 Signed-off-by: Jim O'Regan <[email protected]> * add more spaced alternatives to the non-deterministic cases Signed-off-by: Jim O'Regan <[email protected]> * add the hyphen before or-ing with 000 Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change money handling to keep sparrowhawk happy Signed-off-by: Jim O'Regan <[email protected]> * add [be]os_or_space Signed-off-by: Jim O'Regan <[email protected]> * try just rewriting the offending pieces to see if they are coming from here Signed-off-by: Jim O'Regan <[email protected]> * add extra spaced versions Signed-off-by: Jim O'Regan <[email protected]> * add extra spaced versions Signed-off-by: Jim O'Regan <[email protected]> * Revert "try just rewriting the offending pieces to see if they are coming from here" This reverts commit bc06b1162703354fe7bd5efaff7f58ed981d81c0. Signed-off-by: Jim O'Regan <[email protected]> * add extra spaced versions Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * try here Signed-off-by: Jim O'Regan <[email protected]> * Ok... seems to not be happening here either Revert "try here" This reverts commit 801c5f1c28d234c8b47a1d6f52f662b909fbb1c2. 
Signed-off-by: Jim O'Regan <[email protected]> * try moving a test to see if that makes any difference Signed-off-by: Jim O'Regan <[email protected]> * try duplicating to see if it fails twice Signed-off-by: Jim O'Regan <[email protected]> * Ok, fails both times Revert "try duplicating to see if it fails twice" This reverts commit 908cddc7b3453fb2deaaa201881a304c853a746a. Signed-off-by: Jim O'Regan <[email protected]> * 1 fails in some places, 2 in others, so add 2 here and see if that also fails Signed-off-by: Jim O'Regan <[email protected]> * see if this makes a difference Signed-off-by: Jim O'Regan <[email protected]> * It does not Revert "see if this makes a difference" This reverts commit dacc61281c4efbfd2d5ce1e91386cbd234392d28. Signed-off-by: Jim O'Regan <[email protected]> * rewrite regex to silence deprecation warning Signed-off-by: Jim O'Regan <[email protected]> * REVERTME: change to see what is happening Signed-off-by: Jim O'Regan <[email protected]> * that missing bracket cannot have been good Signed-off-by: Jim O'Regan <[email protected]> * no difference, try just deleting leading zero Signed-off-by: Jim O'Regan <[email protected]> * try again Signed-off-by: Jim O'Regan <[email protected]> * move that thing, merge some lines Signed-off-by: Jim O'Regan <[email protected]> * at least it fails quickly Signed-off-by: Jim O'Regan <[email protected]> * export original Signed-off-by: Jim O'Regan <[email protected]> * move things around for no real reason Signed-off-by: Jim O'Regan <[email protected]> * add in the clean_cardinal from the tutorial Signed-off-by: Jim O'Regan <[email protected]> * Revert "add in the clean_cardinal from the tutorial" This reverts commit 4f06c885a0bfe1acc183c3560d88c4e2e76574ac. Signed-off-by: Jim O'Regan <[email protected]> * try this again Signed-off-by: Jim O'Regan <[email protected]> * pretty sure this should work. 
As should the other Signed-off-by: Jim O'Regan <[email protected]> * comment the ugly kludges to make them easier to remove. They do not work anyway Signed-off-by: Jim O'Regan <[email protected]> * ok, try here Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "rewrite regex to silence deprecation warning" This reverts commit b8a923db57d27c5b3353be4b85bac9efb6e2d220. Signed-off-by: Jim O'Regan <[email protected]> * Revert "REVERTME: change to see what is happening" This reverts commit c73e4ef384ec29a2b5877dac9d4fe617a5c681b6. Signed-off-by: Jim O'Regan <[email protected]> * remove unused imports Signed-off-by: Jim O'Regan <[email protected]> * export unfiltered version of cardinal graph Signed-off-by: Jim O'Regan <[email protected]> * change the variable names Signed-off-by: Jim O'Regan <[email protected]> * get rid of duplicate input print Signed-off-by: Jim O'Regan <[email protected]> * BUGHUNT: check if string has been escaped Signed-off-by: Jim O'Regan <[email protected]> * changing variable, because I am getting tired of looking at that overly long name Signed-off-by: Jim O'Regan <[email protected]> * try deleting the normaliser to see if that makes any difference Signed-off-by: Jim O'Regan <[email protected]> * Revert "BUGHUNT: check if string has been escaped" This reverts commit 70f83241d47b0c73fa41e395eee193cc1685e056. Signed-off-by: Jim O'Regan <[email protected]> * Revert "try deleting the normaliser to see if that makes any difference" This reverts commit 78f4ded93375308dadb9b5e247f030da2efbecb5. 
Signed-off-by: Jim O'Regan <[email protected]> * moving globals into __init__ fixes the problem Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_sparrowhawk_normalization.sh Signed-off-by: Jim O’Regan <[email protected]> * prompt: is not part of the ontology sparrowhawk recognises Signed-off-by: Jim O'Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * these two now conflict Signed-off-by: Jim O'Regan <[email protected]> * rearrange slightly Signed-off-by: Jim O'Regan <[email protected]> * Update telephone.py remove unused import Signed-off-by: Jim O’Regan <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Jim O'Regan <[email protected]> Signed-off-by: Jim O’Regan <[email protected]> Signed-off-by: Jim O’Regan <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * Es bugfix (#59) * improve shortest path for decimals and currency Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * fix sh tn test files for telephone Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * replace non-breaking space Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * improve ambiguous test cases Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * refine weights for decimal Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * improve testing when there are multiple shortest paths Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * revert ES TN for measures with mixed fractions Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * fix formatting Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * add comment for 
testing multiple shortest paths Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> --------- Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Signed-off-by: Mariana <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Store input_case in Normalizer (#65) Signed-off-by: Ryan <[email protected]> Signed-off-by: Alex Cui <[email protected]> * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <[email protected]> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <[email protected]> * whitespace fixes Signed-off-by: Jim O'Regan <[email protected]> * also fix in the verbaliser Signed-off-by: Jim O'Regan <[email protected]> * Update Jenkinsfile Signed-off-by: Jim O’Regan <[email protected]> --------- Signed-off-by: Jim O'Regan <[email protected]> Signed-off-by: Jim O’Regan <[email protected]> Signed-off-by: Alex Cui <[email protected]> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <[email protected]> Signed-off-by: Alex Cui <[email protected]> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <[email protected]> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports 
Signed-off-by: anand-nv <[email protected]> * Remove unused imports Signed-off-by: anand-nv <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <[email protected]> --------- Signed-off-by: ealbasiri <[email protected]> Signed-off-by: anand-nv <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <[email protected]> Signed-off-by: Alex Cui <[email protected]> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * add inits Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> --------- Signed-off-by: Mariana Graterol Fuenmayor <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.no…