Fr tn #90

Closed
wants to merge 101 commits into from

Conversation

mgrafu
Collaborator

@mgrafu mgrafu commented Jul 18, 2023

What does this PR do?

Add a one-line overview of what this PR aims to accomplish.

Before your PR is "Ready for review"

Pre checks:

  • Have you signed your commits? Use git commit -s to sign.
  • Do all unit tests finish successfully before sending the PR?
    1. pytest or (if your machine does not have a GPU) pytest --cpu from the root folder (given you marked your test cases accordingly with @pytest.mark.run_only_on('CPU')).
    2. Sparrowhawk tests: bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
  • If you are adding a new feature: have you added test cases for both pytest and Sparrowhawk here?
  • Have you added __init__.py to every folder and subfolder, including the data folder that holds the .TSV files?
  • Have you followed the CodeQL results and removed unused variables and imports (the report is at the bottom of the PR in the GitHub review box)?
  • Have you added the correct license header Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
  • If you copied nemo_text_processing/text_normalization/en/graph_utils.py, your header's second line should be Copyright 2015 and onwards Google, Inc.. See an example here.
  • Remove import guards (try import: ... except: ...) if not already done.
  • If you added a new language or a new feature, please update the NeMo documentation (it lives in a different repo).
  • Have you added your language support to tools/text_processing_deployment/pynini_export.py?
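
The __init__.py item in the checklist above is easy to miss by eye, so a small helper can check it. What follows is only a minimal sketch, not part of the repository: the function name find_missing_inits is hypothetical, and it assumes that only folders holding .py or .tsv files directly need an __init__.py.

```python
from pathlib import Path


def find_missing_inits(root):
    """Return folders under root that hold .py or .tsv files but lack __init__.py.

    Hypothetical helper: the rule "only folders with .py/.tsv files count"
    is an assumption made for this sketch, not a project requirement.
    """
    root = Path(root)
    # Check the root itself, then every nested directory in a stable order.
    directories = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    missing = []
    for directory in directories:
        has_payload = any(
            child.is_file() and child.suffix.lower() in {".py", ".tsv"}
            for child in directory.iterdir()
        )
        if has_payload and not (directory / "__init__.py").exists():
            missing.append(directory)
    return missing
```

Calling find_missing_inits on the package root returns a list of pathlib.Path objects; an empty list means every folder that matched the assumption already has its __init__.py.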

PR Type:

  • New Feature
  • Bugfix
  • Documentation
  • Test

If you haven't finished some of the above items, you can still open a "Draft" PR.

mgrafu and others added 26 commits January 24, 2023 18:22
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

@github-advanced-security github-advanced-security bot left a comment

CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

dependabot bot and others added 3 commits July 18, 2023 10:02
Bumps [setuptools](https://github.com/pypa/setuptools) from 59.5.0 to 65.5.1.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/CHANGES.rst)
- [Commits](pypa/setuptools@v59.5.0...v65.5.1)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* fix AB norm for non-eng, add args description

Signed-off-by: ekmb <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* style

Signed-off-by: ekmb <[email protected]>

Signed-off-by: ekmb <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: ekmb <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
ekmb and others added 28 commits July 18, 2023 10:15
Signed-off-by: ekmb <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: ekmb <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* Rename "period" tag to "text" tag for date to avoid changes to sparrowhawk proto

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* additional exports from cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* add inflection for quantities

Signed-off-by: Jim O'Regan <[email protected]>

* add a test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* enable decimal

Signed-off-by: Jim O'Regan <[email protected]>

* change integer

Signed-off-by: Jim O'Regan <[email protected]>

* fixes to verbaliser for decimal

Signed-off-by: Jim O'Regan <[email protected]>

* more test cases

Signed-off-by: Jim O'Regan <[email protected]>

* add superessive forms (powers of)

Signed-off-by: Jim O'Regan <[email protected]>

* superscript to superessive

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add vowels

Signed-off-by: Jim O'Regan <[email protected]>

* add vowels

Signed-off-by: Jim O'Regan <[email protected]>

* fix var

Signed-off-by: Jim O'Regan <[email protected]>

* bare minimum electronic test

Signed-off-by: Jim O'Regan <[email protected]>

* add another test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add a symbol

Signed-off-by: Jim O'Regan <[email protected]>

* add incomplete time tagger (partially adapted from de)

Signed-off-by: Jim O'Regan <[email protected]>

* fix error with some inflected abbreviations

Signed-off-by: Jim O'Regan <[email protected]>

* add some alternative measure forms

Signed-off-by: Jim O'Regan <[email protected]>

* hour, minute, second; whichever is last can be inflected

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add test runner for time

Signed-off-by: Jim O'Regan <[email protected]>

* add very minimal time test

Signed-off-by: Jim O'Regan <[email protected]>

* will want cardinal here

Signed-off-by: Jim O'Regan <[email protected]>

* add inflection for things like GBP, where inflection is based on pé

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* docstring

Signed-off-by: Jim O'Regan <[email protected]>

* move two letters

Signed-off-by: Jim O'Regan <[email protected]>

* add my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* partially adapted number tagger (adapted from de)

Signed-off-by: Jim O'Regan <[email protected]>

* small changes

Signed-off-by: Jim O'Regan <[email protected]>

* add unadapted measure tagger (from de)

Signed-off-by: Jim O'Regan <[email protected]>

* other ways of reading w

Signed-off-by: Jim O'Regan <[email protected]>

* for non deterministic, a bunch of these symbols can be read as letters

Signed-off-by: Jim O'Regan <[email protected]>

* currency

Signed-off-by: Jim O'Regan <[email protected]>

* more inflection

Signed-off-by: Jim O'Regan <[email protected]>

* get the abbreviation expanded as letters for non-deterministic

Signed-off-by: Jim O'Regan <[email protected]>

* working now, add a comment

Signed-off-by: Jim O'Regan <[email protected]>

* also integer, and preserve order

Signed-off-by: Jim O'Regan <[email protected]>

* also accept the full words

Signed-off-by: Jim O'Regan <[email protected]>

* deduplicate

Signed-off-by: Jim O'Regan <[email protected]>

* reorder to make a bit more sense

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* explicitly make tuples elsewhere; this works from what I see of the function output, but not in the resulting fst

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* commenting out weighted part makes this work

Signed-off-by: Jim O'Regan <[email protected]>

* duplicate space

Signed-off-by: Jim O'Regan <[email protected]>

* partially adapted money verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* actually saving the adaptations

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add time_zone data (copy from de)

Signed-off-by: Jim O'Regan <[email protected]>

* delete commented code, irrelevant here

Signed-off-by: Jim O'Regan <[email protected]>

* small modifications, still thinking about how to tackle this

Signed-off-by: Jim O'Regan <[email protected]>

* add missing __init__.py

Signed-off-by: Jim O'Regan <[email protected]>

* change year of copyright in empty files, they aren't eligible anyway

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix missing tabs

Signed-off-by: Jim O'Regan <[email protected]>

* remove pynini checks from tests

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused import

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment everything. yolo.

Signed-off-by: Jim O'Regan <[email protected]>

* add verbaliser for measure (unadapted from de)

Signed-off-by: Jim O'Regan <[email protected]>

* add verbaliser for telephone (unadapted from de)

Signed-off-by: Jim O'Regan <[email protected]>

* add verbaliser for time (unadapted from de)

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment everything. yolo.

Signed-off-by: Jim O'Regan <[email protected]>

* fix cache dir

Signed-off-by: Jim O'Regan <[email protected]>

* tagger for telephone (copy from sv)

Signed-off-by: Jim O'Regan <[email protected]>

* add basic tests (native verified)

Signed-off-by: Jim O'Regan <[email protected]>

* add components for read digits

Signed-off-by: Jim O'Regan <[email protected]>

* add an example with a different separator

Signed-off-by: Jim O'Regan <[email protected]>

* start adapting

Signed-off-by: Jim O'Regan <[email protected]>

* add 2-digit area codes

Signed-off-by: Jim O'Regan <[email protected]>

* add another

Signed-off-by: Jim O'Regan <[email protected]>

* add Bp to area codes, no need to be that specific

Signed-off-by: Jim O'Regan <[email protected]>

* export var

Signed-off-by: Jim O'Regan <[email protected]>

* in progress

Signed-off-by: Jim O'Regan <[email protected]>

* country codes

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* copy/paste errors abound

Signed-off-by: Jim O'Regan <[email protected]>

* put in a function rather than duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* nominal digits

Signed-off-by: Jim O'Regan <[email protected]>

* add IP prompt

Signed-off-by: Jim O'Regan <[email protected]>

* add google copyright notice; probably meaningless

Signed-off-by: Jim O'Regan <[email protected]>

* more work on telephone

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused import

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* minor adaptation; more needed

Signed-off-by: Jim O'Regan <[email protected]>

* replace time verbaliser with version from sv

Signed-off-by: Jim O'Regan <[email protected]>

* adapt more

Signed-off-by: Jim O'Regan <[email protected]>

* nearly there

Signed-off-by: Jim O'Regan <[email protected]>

* replace with version from sv

Signed-off-by: Jim O'Regan <[email protected]>

* extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add an IP test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add a couple more ordinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* move variables

Signed-off-by: Jim O'Regan <[email protected]>

* filter ordinals

Signed-off-by: Jim O'Regan <[email protected]>

* basic fraction tests

Signed-off-by: Jim O'Regan <[email protected]>

* . and / both clash, so only make year optional if it is not deterministic

Signed-off-by: Jim O'Regan <[email protected]>

* using the other word for two, that test cannot pass

Signed-off-by: Jim O'Regan <[email protected]>

* numerator and denominator can compound; add minus

Signed-off-by: Jim O'Regan <[email protected]>

* form fractionals in ordinal, because something about bare_ordinals does not work when exported

Signed-off-by: Jim O'Regan <[email protected]>

* add another test, including spaces

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, not in reality

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* copy fraction symbols from es

Signed-off-by: Jim O'Regan <[email protected]>

* copy two lines from es to handle fraction symbols

Signed-off-by: Jim O'Regan <[email protected]>

* add a test for that

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* ah, I was forgetting to delete preserve order

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add pieces from swedish itn, adapted

Signed-off-by: Jim O'Regan <[email protected]>

* add a function to give from/to minutes for 15/30/45 subdivision

Signed-off-by: Jim O'Regan <[email protected]>

* add functions, but some pieces came from ITN, so are backwards

Signed-off-by: Jim O'Regan <[email protected]>

* ok, should change the quarter word to a cardinal, or something

Signed-off-by: Jim O'Regan <[email protected]>

* swapping order

Signed-off-by: Jim O'Regan <[email protected]>

* more swapping

Signed-off-by: Jim O'Regan <[email protected]>

* remove import

Signed-off-by: Jim O'Regan <[email protected]>

* add an example

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change some things

Signed-off-by: Jim O'Regan <[email protected]>

* some things fixed

Signed-off-by: Jim O'Regan <[email protected]>

* more adjustments to time

Signed-off-by: Jim O'Regan <[email protected]>

* more todo, but working for this subset

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more time

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* missing endings

Signed-off-by: Jim O'Regan <[email protected]>

* sort|uniq

Signed-off-by: Jim O'Regan <[email protected]>

* timezone can be inflected too

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add sparrowhawk test (todo)

Signed-off-by: Jim O'Regan <[email protected]>

* add test_cases_word (copy from sv)

Signed-off-by: Jim O'Regan <[email protected]>

* add some word cases with Hungarian accents

Signed-off-by: Jim O'Regan <[email protected]>

* add Hungarian to Jenkinsfile. This may cause much distress and wailing and gnashing of teeth

Signed-off-by: Jim O'Regan <[email protected]>

* fix the commented ITN part

Signed-off-by: Jim O'Regan <[email protected]>

* add hu

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases for the last two parts

Signed-off-by: Jim O'Regan <[email protected]>

* fix measure cardinals

Signed-off-by: Jim O'Regan <[email protected]>

* a couple more tests, last still not working

Signed-off-by: Jim O'Regan <[email protected]>

* missed removing preserve_order

Signed-off-by: Jim O'Regan <[email protected]>

* fix test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* codeql

Signed-off-by: Jim O'Regan <[email protected]>

* codeql

Signed-off-by: Jim O'Regan <[email protected]>

* comment the variables I may wish to use later (codeql)

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix decimals

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* incorporate feedback from @Laszlo-Weber

Signed-off-by: Jim O'Regan <[email protected]>

* bare minimum tests + fix verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add öre (also for NOK)

Signed-off-by: Jim O’Regan <[email protected]>

* Comment line, for now

Signed-off-by: Jim O’Regan <[email protected]>

* try breaking this into pieces

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing __init__.py

Signed-off-by: Jim O'Regan <[email protected]>

* add missing __init__.py

Signed-off-by: Jim O'Regan <[email protected]>

* revert 0c6823e

Signed-off-by: Jim O'Regan <[email protected]>

* fix a bug in cardinal graph

Signed-off-by: Jim O'Regan <[email protected]>

* at no point is 000 being deleted; probably why the tests are weird

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* revert a0d031a

Signed-off-by: Jim O'Regan <[email protected]>

* add more spaced alternatives to the non-deterministic cases

Signed-off-by: Jim O'Regan <[email protected]>

* add the hyphen before or-ing with 000

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change money handling to keep sparrowhawk happy

Signed-off-by: Jim O'Regan <[email protected]>

* add [be]os_or_space

Signed-off-by: Jim O'Regan <[email protected]>

* try just rewriting the offending pieces to see if they are coming from here

Signed-off-by: Jim O'Regan <[email protected]>

* add extra spaced versions

Signed-off-by: Jim O'Regan <[email protected]>

* add extra spaced versions

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "try just rewriting the offending pieces to see if they are coming from here"

This reverts commit bc06b11.

Signed-off-by: Jim O'Regan <[email protected]>

* add extra spaced versions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* try here

Signed-off-by: Jim O'Regan <[email protected]>

* Ok... seems to not be happening here either

Revert "try here"

This reverts commit 801c5f1.

Signed-off-by: Jim O'Regan <[email protected]>

* try moving a test to see if that makes any difference

Signed-off-by: Jim O'Regan <[email protected]>

* try duplicating to see if it fails twice

Signed-off-by: Jim O'Regan <[email protected]>

* Ok, fails both times

Revert "try duplicating to see if it fails twice"

This reverts commit 908cddc.

Signed-off-by: Jim O'Regan <[email protected]>

* 1 fails in some places, 2 in others, so add 2 here and see if that also fails

Signed-off-by: Jim O'Regan <[email protected]>

* see if this makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* It does not

Revert "see if this makes a difference"

This reverts commit dacc612.

Signed-off-by: Jim O'Regan <[email protected]>

* rewrite regex to silence deprecation warning

Signed-off-by: Jim O'Regan <[email protected]>

* REVERTME: change to see what is happening

Signed-off-by: Jim O'Regan <[email protected]>

* that missing bracket cannot have been good

Signed-off-by: Jim O'Regan <[email protected]>

* no difference, try just deleting leading zero

Signed-off-by: Jim O'Regan <[email protected]>

* try again

Signed-off-by: Jim O'Regan <[email protected]>

* move that thing, merge some lines

Signed-off-by: Jim O'Regan <[email protected]>

* at least it fails quickly

Signed-off-by: Jim O'Regan <[email protected]>

* export original

Signed-off-by: Jim O'Regan <[email protected]>

* move things around for no real reason

Signed-off-by: Jim O'Regan <[email protected]>

* add in the clean_cardinal from the tutorial

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "add in the clean_cardinal from the tutorial"

This reverts commit 4f06c88.

Signed-off-by: Jim O'Regan <[email protected]>

* try this again

Signed-off-by: Jim O'Regan <[email protected]>

* pretty sure this should work. As should the other

Signed-off-by: Jim O'Regan <[email protected]>

* comment the ugly kludges to make them easier to remove. They do not work anyway

Signed-off-by: Jim O'Regan <[email protected]>

* ok, try here

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "rewrite regex to silence deprecation warning"

This reverts commit b8a923d.

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "REVERTME: change to see what is happening"

This reverts commit c73e4ef.

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* export unfiltered version of cardinal graph

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of duplicate input print

Signed-off-by: Jim O'Regan <[email protected]>

* BUGHUNT: check if string has been escaped

Signed-off-by: Jim O'Regan <[email protected]>

* changing variable, because I am getting tired of looking at that overly long name

Signed-off-by: Jim O'Regan <[email protected]>

* try deleting the normaliser to see if that makes any difference

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "BUGHUNT: check if string has been escaped"

This reverts commit 70f8324.

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "try deleting the normaliser to see if that makes any difference"

This reverts commit 78f4ded.

Signed-off-by: Jim O'Regan <[email protected]>

* moving globals into __init__ fixes the problem

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_sparrowhawk_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

* prompt: is not part of the ontology sparrowhawk recognises

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* these two now conflict

Signed-off-by: Jim O'Regan <[email protected]>

* rearrange slightly

Signed-off-by: Jim O'Regan <[email protected]>

* Update telephone.py

remove unused import

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* improve shortest path for decimals and currency

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix sh tn test files for telephone

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* replace non-breaking space

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* improve ambiguous test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine weights for decimal

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* improve testing when there are multiple shortest paths

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* revert ES TN for measures with mixed fractions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix formatting

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add comment for testing multiple shortest paths

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Ryan <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfd.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb748.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates accordingly to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them for format issue

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
for more information, see https://pre-commit.ci

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
@mgrafu mgrafu closed this Jul 18, 2023
@ekmb ekmb deleted the FR_TN branch May 22, 2024 16:01