Zh itn #74

Merged 99 commits into main from zh_itn on Jun 30, 2023

Conversation

BuyuanCui
Collaborator

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Before your PR is "Ready for review"

Pre checks:

  • [x] Have you signed your commits? Use git commit -s to sign.
  • [x] Do all unit tests finish successfully before sending the PR?
    1. pytest or (if your machine does not have a GPU) pytest --cpu from the root folder (given you marked your test cases accordingly with @pytest.mark.run_only_on('CPU')).
    2. Sparrowhawk tests: bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
  • [x] If you are adding a new feature: Have you added test cases for both pytest and Sparrowhawk here?
  • [x] Have you added __init__.py for every folder and subfolder, including the data folder which has .TSV files?
  • [x] Have you followed the CodeQL results and removed unused variables and imports (the report is at the bottom of the PR in the GitHub review box)?
  • [x] Have you added the correct license header Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
  • [x] If you copied nemo_text_processing/text_normalization/en/graph_utils.py, your header's second line should be Copyright 2015 and onwards Google, Inc.. See an example here.
  • [x] Remove import guards (try import: ... except: ...) if not already done.
  • [x] If you added a new language or a new feature, please update the NeMo documentation (lives in a different repo).
  • [x] Have you added your language support to tools/text_processing_deployment/pynini_export.py?

PR Type:

  • [x] New Feature
  • [ ] Bugfix
  • [ ] Documentation
  • [ ] Test

If you haven't finished some of the above items, you can still open a "Draft" PR.
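For readers unfamiliar with the feature: this PR adds inverse text normalization (ITN) for Mandarin (zh), which maps spoken-form text such as Chinese numerals to written form. The sketch below is only a rough, pure-Python illustration of what a cardinal conversion does. It is not the grammar from this PR (the commits refer to pynini-based tagger and verbalizer grammars), and the function name `toy_zh_cardinal_itn` and its lookup tables are hypothetical. It does not handle nested units such as 万亿.

```python
# Toy illustration only: NOT the pynini grammar added in this PR.
# All names below are hypothetical and exist just for this sketch.
DIGITS = {"零": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"万": 10_000, "亿": 100_000_000}


def toy_zh_cardinal_itn(text: str) -> str:
    """Convert a simple Mandarin cardinal (e.g. 一百二十三) to Arabic digits."""
    total = 0    # sum of sections already closed by 万 or 亿
    section = 0  # value accumulated below the next 万/亿 boundary
    digit = 0    # most recent digit, waiting for its unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A unit with no pending digit (as in 十五) counts as one of that unit.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in BIG_UNITS:
            # Close the current section and scale it by 万 or 亿.
            total += (section + digit) * BIG_UNITS[ch]
            section = 0
            digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return str(total + section + digit)


if __name__ == "__main__":
    for spoken in ["一百二十三", "十五", "两万五千", "一千零一"]:
        print(spoken, "->", toy_zh_cardinal_itn(spoken))
```

In the actual package, the equivalent behaviour would be exercised through the pytest and Sparrowhawk test cases mentioned in the checklist above rather than through a standalone function like this one.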

alexcui-nvidia and others added 30 commits February 8, 2023 20:00
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
…he files due to format issues

Signed-off-by: BuyuanCui <[email protected]>
… removed am and pm suffix files according to the last PR. Recreated some of them for format issue

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
…ing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>
BuyuanCui and others added 4 commits June 29, 2023 15:20
Signed-off-by: BuyuanCui <[email protected]>
Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>

fayejf left a comment

Great work!! LGTM! Thanks!

fayejf merged commit cf62bb8 into main Jun 30, 2023
fayejf deleted the zh_itn branch June 30, 2023 02:39
gayu-thri pushed a commit to gayu-thri/NeMo-text-processing that referenced this pull request Jun 30, 2023
* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranges for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates accordingly to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: gayu-thri <[email protected]>
gayu-thri pushed a commit to gayu-thri/NeMo-text-processing that referenced this pull request Jul 3, 2023
mgrafu pushed a commit that referenced this pull request Jul 18, 2023
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
BuyuanCui added a commit that referenced this pull request Dec 12, 2023
* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for langauge import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranges for CHInese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates accordingly to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated gramamr according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* gramamr for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* gramamr for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Feb 16, 2024
* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuation

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus processing of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. Removed am and pm and allowed simple Mandarin expressions

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuation

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed am and pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>
ekmb added a commit that referenced this pull request Apr 30, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path in for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranges for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. removing am and pm and allowing simple Mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixings according to ci

Signed-off-by: Alex Cui <[email protected]>

* fixings according to the ci

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* notused removing

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented-out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in the process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing and combined to measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding Anand's updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh tnitn tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch (which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing the bug raised in issue #162 (bug in graph_utils.py of zh ITN and decimal tagger of ar TN).

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style. 

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <690…
BuyuanCui added a commit that referenced this pull request Jul 12, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to the national guideline document

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixings according to ci

Signed-off-by: Alex Cui <[email protected]>

* fixings according to the ci

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* notused removing

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented-out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing and combining into measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding anands updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh tn/itn tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch ( which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing the bug raised in issue #162 (bug in graph_utils.py of zh ITN and decimal tagger of ar TN).

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing the bug raised in issue #162 (bug in graph_utils.py of zh ITN and decimal tagger of ar TN).

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Enno Hermann <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Enas Albasiri <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Giacomo Leone Maria Cavallini <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Peter Plantinga <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Jul 25, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranges for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates accordingly to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them for format issue

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixings according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixings according to ci

Signed-off-by: Alex Cui <[email protected]>

* fixings according to the ci

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* notused removing

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing and combined to measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding anands updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh tnitn tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch (which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing the bug raised in "bug in graph_utils.py of zh ITN and decimal tagger of ar TN" (#162).

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Enno Hermann <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Enas Albasiri <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Giacomo Leone Maria Cavallini <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Peter Plantinga <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
tbartley94 added a commit that referenced this pull request Aug 16, 2024
* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix broken path for nondet whitelist (#124)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Increase weights for serial (en TN) (#128)

* Increase weights for serial (en TN)

Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126

Signed-off-by: anand-nv <[email protected]>

* Add tests for fix

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile cache path

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile. Fix cache folder

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add measures file for FR TN (#131)

* add measures file

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update whitelist data

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh jenkins (#127)

* Add SH tests to Jenkins

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkins tests

Signed-off-by: Anand Joseph <[email protected]>

* Add CI/CD tests for sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* docker build only if in test mode

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing variable

Signed-off-by: Anand Joseph <[email protected]>

* Fix comments and remove arguments not required

Signed-off-by: Anand Joseph <[email protected]>

* Fix commands not executing

Signed-off-by: Anand Joseph <[email protected]>

* Missing arguments

Signed-off-by: Anand Joseph <[email protected]>

* Missing quotes

Signed-off-by: Anand Joseph <[email protected]>

* Fix incorrect path for tests

Signed-off-by: Anand Joseph <[email protected]>

* Fix paths

Signed-off-by: Anand Joseph <[email protected]>

* Incorrect paths of tests and shunit2

Signed-off-by: Anand Joseph <[email protected]>

* Fix issues with paths as arguments to shunit

Signed-off-by: Anand Joseph <[email protected]>

* Undo path change

Signed-off-by: Anand Joseph <[email protected]>

* Fix intentional fail test

Signed-off-by: Anand Joseph <[email protected]>

* revert redundant check for cased option

Signed-off-by: Anand Joseph <[email protected]>

* Fix default path in export_grammars.sh

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Add interactive option

Signed-off-by: Anand Joseph <[email protected]>

* Add SH tests for cased EN ITN

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* update isort - fix precommit (#138)

* update isort version

Signed-off-by: Evelina <[email protected]>

* update isort version

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused imports

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian itn (#136)

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* Added context for tests and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* Revert "Added context for tests and fixed CodeQL errors"

This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b.

Signed-off-by: David Sargsyan <[email protected]>

* Added context to some test files and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* deleted unnecessary data

Signed-off-by: David Sargsyan <[email protected]>

* translated a few measurements to Armenian

Signed-off-by: David Sargsyan <[email protected]>

* adjusted some things for better readability and maintainer support

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed one test case and some issues

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Fix CI (#142)

* fix whitelist deployment

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comment out tests to recreate grammars

Signed-off-by: Evelina <[email protected]>

* shorten test

Signed-off-by: Evelina <[email protected]>

* fix jenkins

Signed-off-by: Evelina <[email protected]>

* cased for TN

Signed-off-by: Evelina <[email protected]>

* revert debug changes

Signed-off-by: Evelina <[email protected]>

* fix args default

Signed-off-by: Evelina <[email protected]>

* try parallel

Signed-off-by: Evelina <[email protected]>

* debug parallel

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* fix sh tests for local SH launcher

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian TN (#137)

* merged with main branch and fixed conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing some more conflicts

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* fixed a minor issue

Signed-off-by: David Sargsyan <[email protected]>

* deleted unused imports

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix: add "hy" language option for armenian

Signed-off-by: Ara Yeroyan <[email protected]>

* added optional space for measurements after cardinals/decimals

Signed-off-by: David Sargsyan <[email protected]>

* added Armenian dot

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Signed-off-by: Ara Yeroyan <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ara Yeroyan <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Marathi ITN (#134)

* Added Marathi ITN

Signed-off-by: Chinmay Patil <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding jenkins test

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Chinmay Patil <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Travis Bartley <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins fix (#150)

* jenkins fix

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* missing __init__ for python

Signed-off-by: Travis Bartley <[email protected]>

* mislabeled cache

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* r0.3.0 release (#151)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Fix text=line[text] to text=line[text_field] (#153)

Signed-off-by: Sasha Meister <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* use real string on docstring (#157)

Signed-off-by: Kevin Sanders <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh postprocess (#147)

* Add support for postprocessor far in sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Choose between having a post processor or not

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* update run_evaluate script for cased itn (#164)

* update run_evaluate script for cased itn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* remove unused function from ar tn decimals (#165)

* remove unused function from ar tn decimals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* ZH sentence-level TN (#112)

* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuation

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from the last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus processing of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. Removing am and pm and allowing simple Mandarin expressions

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removing am and pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: Gia…
BuyuanCui added a commit that referenced this pull request Aug 20, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuation

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money, updated according to last PR. Plus processing of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. Removing am and pm and allowing simple Mandarin expressions

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removing am and pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixings according to ci

Signed-off-by: Alex Cui <[email protected]>

* fixings according to the ci

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* notused removing

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented-out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing and combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding Anand's updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh tnitn tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch (which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing the bug raised in issue #162 (bug in graph_utils.py of zh ITN and decimal tagger of ar TN).

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing the bug raised in issue #162 (bug in graph_utils.py of zh ITN and decimal tagger of ar TN).

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Enno Hermann <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Enas Albasiri <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Giacomo Leone Maria Cavallini <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Peter Plantinga <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Aug 20, 2024
* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix broken path for nondet whitelist (#124)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Increase weights for serial (en TN) (#128)

* Increase weights for serial (en TN)

Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126

Signed-off-by: anand-nv <[email protected]>

* Add tests for fix

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile cache path

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile. Fix cache folder

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add measures file for FR TN (#131)

* add measures file

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update whitelist data

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh jenkins (#127)

* Add SH tests to Jenkins

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkins tests

Signed-off-by: Anand Joseph <[email protected]>

* Add CI/CD tests for sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* docker build only if in test mode

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing variable

Signed-off-by: Anand Joseph <[email protected]>

* Fix comments and remove arguments not required

Signed-off-by: Anand Joseph <[email protected]>

* Fix commands not executing

Signed-off-by: Anand Joseph <[email protected]>

* Missing arguments

Signed-off-by: Anand Joseph <[email protected]>

* Missing quotes

Signed-off-by: Anand Joseph <[email protected]>

* Fix incorrect path for tests

Signed-off-by: Anand Joseph <[email protected]>

* Fix paths

Signed-off-by: Anand Joseph <[email protected]>

* Incorrect paths of tests and shunit2

Signed-off-by: Anand Joseph <[email protected]>

* Fix issues with paths as arguments to shunit

Signed-off-by: Anand Joseph <[email protected]>

* Undo path change

Signed-off-by: Anand Joseph <[email protected]>

* Fix intentional fail test

Signed-off-by: Anand Joseph <[email protected]>

* revert redundant check for cased option

Signed-off-by: Anand Joseph <[email protected]>

* Fix default path in export_grammars.sh

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Add interactive option

Signed-off-by: Anand Joseph <[email protected]>

* Add SH tests for cased EN ITN

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* update isort - fix precommit (#138)

* update isort version

Signed-off-by: Evelina <[email protected]>

* update isort version

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused imports

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian itn (#136)

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* Added context for tests and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* Revert "Added context for tests and fixed CodeQL errors"

This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b.

Signed-off-by: David Sargsyan <[email protected]>

* Added context to some test files and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* deleted unnecessary data

Signed-off-by: David Sargsyan <[email protected]>

* translated a few measurements to Armenian

Signed-off-by: David Sargsyan <[email protected]>

* adjusted some things for better readability and maintainer support

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed one test case and some issues

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Fix CI (#142)

* fix whitelist deployment

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comment out tests to recreate grammars

Signed-off-by: Evelina <[email protected]>

* shorten test

Signed-off-by: Evelina <[email protected]>

* fix jenkins

Signed-off-by: Evelina <[email protected]>

* cased for TN

Signed-off-by: Evelina <[email protected]>

* revert debug changes

Signed-off-by: Evelina <[email protected]>

* fix args default

Signed-off-by: Evelina <[email protected]>

* try parallel

Signed-off-by: Evelina <[email protected]>

* debug parallel

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* fix sh tests for local SH launcher

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian TN (#137)

* merged with main branch and fixed conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing some more conflicts

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* fixed a minor issue

Signed-off-by: David Sargsyan <[email protected]>

* deleted unused imports

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix: add "hy" language option for armenian

Signed-off-by: Ara Yeroyan <[email protected]>

* added optional space for measurements after cardinals/decimals

Signed-off-by: David Sargsyan <[email protected]>

* added Armenian dot

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Signed-off-by: Ara Yeroyan <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ara Yeroyan <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Marathi ITN (#134)

* Added Marathi ITN

Signed-off-by: Chinmay Patil <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding jenkins test

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Chinmay Patil <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Travis Bartley <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins fix (#150)

* jenkins fix

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* missing __init__ for python

Signed-off-by: Travis Bartley <[email protected]>

* mislabeled cache

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* r0.3.0 release (#151)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Fix text=line[text] to text=line[text_field] (#153)

Signed-off-by: Sasha Meister <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* use real string on docstring (#157)

Signed-off-by: Kevin Sanders <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh postprocess (#147)

* Add support for postprocessor far in sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Choose between having a post processor or not

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* update run_evaluate script for cased itn (#164)

* update run_evaluate script for cased itn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* remove unused function from ar tn decimals (#165)

* remove unused function from ar tn decimals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* ZH sentence-level TN (#112)

* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. Removing am and pm and allowing simple Mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to the document on the national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated …
BuyuanCui added a commit that referenced this pull request Sep 19, 2024
* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. Removing am and pm and allowing simple Mandarin expressions

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removing am and pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Sep 19, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path in for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuation

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from the last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money, updated according to last PR. Plus processing of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. removing am and pm and allowing simple Mandarin expressions

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to the document on the national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixes according to CI

Signed-off-by: Alex Cui <[email protected]>

* fixes according to the CI

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* removing unused

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented-out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removed and combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding Anand's updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh tnitn tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch (which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing bug raised in bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Enno Hermann <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Enas Albasiri <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Giacomo Leone Maria Cavallini <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Peter Plantinga <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Sep 19, 2024
* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix broken path for nondet whitelist (#124)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Increase weights for serial (en TN) (#128)

* Increase weights for serial (en TN)

Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126

Signed-off-by: anand-nv <[email protected]>

* Add tests for fix

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile cache path

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile. Fix cache folder

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add measures file for FR TN (#131)

* add measures file

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update whitelist data

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh jenkins (#127)

* Add SH tests to Jenkins

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkins tests

Signed-off-by: Anand Joseph <[email protected]>

* Add CI/CD tests for sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* docker build only if in test mode

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing variable

Signed-off-by: Anand Joseph <[email protected]>

* Fix comments and remove arguments not required

Signed-off-by: Anand Joseph <[email protected]>

* Fix commands not executing

Signed-off-by: Anand Joseph <[email protected]>

* Missing arguments

Signed-off-by: Anand Joseph <[email protected]>

* Missing quotes

Signed-off-by: Anand Joseph <[email protected]>

* Fix incorrect path for tests

Signed-off-by: Anand Joseph <[email protected]>

* Fix paths

Signed-off-by: Anand Joseph <[email protected]>

* Incorrect paths of tests and shunit2

Signed-off-by: Anand Joseph <[email protected]>

* Fix issues with paths as arguments to shunit

Signed-off-by: Anand Joseph <[email protected]>

* Undo path change

Signed-off-by: Anand Joseph <[email protected]>

* Fix intentional fail test

Signed-off-by: Anand Joseph <[email protected]>

* revert redundant check for cased option

Signed-off-by: Anand Joseph <[email protected]>

* Fix default path in export_grammars.sh

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Add interactive option

Signed-off-by: Anand Joseph <[email protected]>

* Add SH tests for cased EN ITN

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* update isort - fix precommit (#138)

* update isort version

Signed-off-by: Evelina <[email protected]>

* update isort version

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused imports

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian itn (#136)

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* Added context for tests and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* Revert "Added context for tests and fixed CodeQL errors"

This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b.

Signed-off-by: David Sargsyan <[email protected]>

* Added context to some test files and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* deleted unnecessary data

Signed-off-by: David Sargsyan <[email protected]>

* translated a few measurements to Armenian

Signed-off-by: David Sargsyan <[email protected]>

* adjusted some things for better readability and maintainer support

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed one test case and some issues

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Fix CI (#142)

* fix whitelist deployment

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comment out tests to recreate grammars

Signed-off-by: Evelina <[email protected]>

* shorten test

Signed-off-by: Evelina <[email protected]>

* fix jenkins

Signed-off-by: Evelina <[email protected]>

* cased for TN

Signed-off-by: Evelina <[email protected]>

* revert debug changes

Signed-off-by: Evelina <[email protected]>

* fix args default

Signed-off-by: Evelina <[email protected]>

* try parallel

Signed-off-by: Evelina <[email protected]>

* debug parallel

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* fix sh tests for local SH launcher

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian TN (#137)

* merged with main branch and fixed conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing some more conflicts

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* fixed a minor issue

Signed-off-by: David Sargsyan <[email protected]>

* deleted unused imports

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix: add "hy" language option for armenian

Signed-off-by: Ara Yeroyan <[email protected]>

* added optional space for measurements after cardinals/decimals

Signed-off-by: David Sargsyan <[email protected]>

* added Armenian dot

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Signed-off-by: Ara Yeroyan <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ara Yeroyan <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Marathi ITN (#134)

* Added Marathi ITN

Signed-off-by: Chinmay Patil <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding jenkins test

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Chinmay Patil <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Travis Bartley <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins fix (#150)

* jenkins fix

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* missing _init_ for python

Signed-off-by: Travis Bartley <[email protected]>

* mislabeled cache

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* r0.3.0 release (#151)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Fix text=line[text] to text=line[text_field] (#153)

Signed-off-by: Sasha Meister <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* use real string on docstring (#157)

Signed-off-by: Kevin Sanders <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh postprocess (#147)

* Add support for postprocessor far in sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Choose between having a post processor or not

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* update run_evaluate script for cased itn (#164)

* update run_evaluate script for cased itn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* remove unused function from ar tn decimals (#165)

* remove unused function from ar tn decimals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* ZH sentence-level TN (#112)

* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path in for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from the last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated …
BuyuanCui added a commit that referenced this pull request Sep 26, 2024
* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates accordingly to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Sep 26, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path in for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranges for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates accordingly to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them for format issue

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixings according to ci

Signed-off-by: Alex Cui <[email protected]>

* fixings according to the ci

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* removing not used

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented-out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing and combined to measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding Anand's updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh tnitn tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing used graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing used comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch (which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing the bug raised in issue #162 (bug in graph_utils.py of zh ITN and decimal tagger of ar TN).

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Enno Hermann <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Enas Albasiri <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Giacomo Leone Maria Cavallini <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Peter Plantinga <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Sep 26, 2024
* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix broken path for nondet whitelist (#124)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Increase weights for serial (en TN) (#128)

* Increase weights for serial (en TN)

Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126

Signed-off-by: anand-nv <[email protected]>

* Add tests for fix

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile cache path

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile. Fix cache folder

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add measures file for FR TN (#131)

* add measures file

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update whitelist data

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh jenkins (#127)

* Add SH tests to Jenkins

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkins tests

Signed-off-by: Anand Joseph <[email protected]>

* Add CI/CD tests for sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* docker build only if in test mode

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing variable

Signed-off-by: Anand Joseph <[email protected]>

* Fix comments and remove arguments not required

Signed-off-by: Anand Joseph <[email protected]>

* Fix commands not executing

Signed-off-by: Anand Joseph <[email protected]>

* Missing arguments

Signed-off-by: Anand Joseph <[email protected]>

* Missing quotes

Signed-off-by: Anand Joseph <[email protected]>

* Fix incorrect path for tests

Signed-off-by: Anand Joseph <[email protected]>

* Fix paths

Signed-off-by: Anand Joseph <[email protected]>

* Incorrect paths of tests and shunit2

Signed-off-by: Anand Joseph <[email protected]>

* Fix issues with paths as arguments to shunit

Signed-off-by: Anand Joseph <[email protected]>

* Undo path change

Signed-off-by: Anand Joseph <[email protected]>

* Fix intentional fail test

Signed-off-by: Anand Joseph <[email protected]>

* revert redundant check for cased option

Signed-off-by: Anand Joseph <[email protected]>

* Fix default path in export_grammars.sh

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Add interactive option

Signed-off-by: Anand Joseph <[email protected]>

* Add SH tests for cased EN ITN

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* update isort - fix precommit (#138)

* update isort version

Signed-off-by: Evelina <[email protected]>

* update isort version

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused imports

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian itn (#136)

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* Added context for tests and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* Revert "Added context for tests and fixed CodeQL errors"

This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b.

Signed-off-by: David Sargsyan <[email protected]>

* Added context to some test files and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* deleted unnecessary data

Signed-off-by: David Sargsyan <[email protected]>

* translated a few measurements to Armenian

Signed-off-by: David Sargsyan <[email protected]>

* adjusted some things for better readability and maintainer support

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed one test case and some issues

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Fix CI (#142)

* fix whitelist deployment

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comment out tests to recreate grammars

Signed-off-by: Evelina <[email protected]>

* shorten test

Signed-off-by: Evelina <[email protected]>

* fix jenkins

Signed-off-by: Evelina <[email protected]>

* cased for TN

Signed-off-by: Evelina <[email protected]>

* revert debug changes

Signed-off-by: Evelina <[email protected]>

* fix args default

Signed-off-by: Evelina <[email protected]>

* try parallel

Signed-off-by: Evelina <[email protected]>

* debug parallel

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* fix sh tests for local SH launcher

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian TN (#137)

* merged with main branch and fixed conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing some more conflicts

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* fixed a minor issue

Signed-off-by: David Sargsyan <[email protected]>

* deleted unused imports

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix: add "hy" language option for armenian

Signed-off-by: Ara Yeroyan <[email protected]>

* added optional space for measurements after cardinals/decimals

Signed-off-by: David Sargsyan <[email protected]>

* added Armenian dot

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Signed-off-by: Ara Yeroyan <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ara Yeroyan <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Marathi ITN (#134)

* Added Marathi ITN

Signed-off-by: Chinmay Patil <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding jenkins test

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Chinmay Patil <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Travis Bartley <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins fix (#150)

* jenkins fix

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* missing __init__ for python

Signed-off-by: Travis Bartley <[email protected]>

* mislabeled cache

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* r0.3.0 release (#151)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Fix text=line[text] to text=line[text_field] (#153)

Signed-off-by: Sasha Meister <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* use real string on docstring (#157)

Signed-off-by: Kevin Sanders <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh postprocess (#147)

* Add support for postprocessor far in sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Choose between having a post processor or not

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* update run_evaluate script for cased itn (#164)

* update run_evaluate script for cased itn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* remove unused function from ar tn decimals (#165)

* remove unused function from ar tn decimals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* ZH sentence-level TN (#112)

* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuation

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranged for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money, updated according to last PR. Plus processing of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. removing am and pm and allowing simple Mandarin expressions

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuation

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to the document on national guidelines

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated …
BuyuanCui added a commit that referenced this pull request Sep 26, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranged for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixings according to ci

Signed-off-by: Alex Cui <[email protected]>

* fixings according to the ci

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* removing unused

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented-out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing and combining into measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding Anand's updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh TN/ITN tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch (which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing the bug raised in issue #162 (bug in graph_utils.py of zh ITN and decimal tagger of ar TN).

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Enno Hermann <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Enas Albasiri <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Giacomo Leone Maria Cavallini <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Peter Plantinga <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Sep 26, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranged for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money, updated according to last PR. Plus processing of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. Removed am and pm and allowed simple Mandarin expressions

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed am and pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in a data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixings according to ci

Signed-off-by: Alex Cui <[email protected]>

* fixings according to the ci

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* removing not used

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing and combining into measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding Anand's updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh tnitn tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch (which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing the bug raised in "bug in graph_utils.py of zh ITN and decimal tagger of ar TN" #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Enno Hermann <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Enas Albasiri <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Giacomo Leone Maria Cavallini <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Peter Plantinga <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Sep 26, 2024
* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix broken path for nondet whitelist (#124)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Increase weights for serial (en TN) (#128)

* Increase weights for serial (en TN)

Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126

Signed-off-by: anand-nv <[email protected]>

* Add tests for fix

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile cache path

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile. Fix cache folder

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add measures file for FR TN (#131)

* add measures file

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update whitelist data

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh jenkins (#127)

* Add SH tests to Jenkins

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkins tests

Signed-off-by: Anand Joseph <[email protected]>

* Add CI/CD tests for sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* docker build only if in test mode

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing variable

Signed-off-by: Anand Joseph <[email protected]>

* Fix comments and remove arguments not required

Signed-off-by: Anand Joseph <[email protected]>

* Fix commands not executing

Signed-off-by: Anand Joseph <[email protected]>

* Missing arguments

Signed-off-by: Anand Joseph <[email protected]>

* Missing quotes

Signed-off-by: Anand Joseph <[email protected]>

* Fix incorrect path for tests

Signed-off-by: Anand Joseph <[email protected]>

* Fix paths

Signed-off-by: Anand Joseph <[email protected]>

* Incorrect paths of tests and shunit2

Signed-off-by: Anand Joseph <[email protected]>

* Fix issues with paths as arguments to shunit

Signed-off-by: Anand Joseph <[email protected]>

* Undo path change

Signed-off-by: Anand Joseph <[email protected]>

* Fix intentional fail test

Signed-off-by: Anand Joseph <[email protected]>

* revert redundant check for cased option

Signed-off-by: Anand Joseph <[email protected]>

* Fix default path in export_grammars.sh

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Add interactive option

Signed-off-by: Anand Joseph <[email protected]>

* Add SH tests for cased EN ITN

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* update isort - fix precommit (#138)

* update isort version

Signed-off-by: Evelina <[email protected]>

* update isort version

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused imports

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian itn (#136)

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* Added context for tests and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* Revert "Added context for tests and fixed CodeQL errors"

This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b.

Signed-off-by: David Sargsyan <[email protected]>

* Added context to some test files and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* deleted unnecessary data

Signed-off-by: David Sargsyan <[email protected]>

* translated a few measurements to Armenian

Signed-off-by: David Sargsyan <[email protected]>

* adjusted some things for better readability and maintainer support

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed one test case and some issues

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Fix CI (#142)

* fix whitelist deployment

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comment out tests to recreate grammars

Signed-off-by: Evelina <[email protected]>

* shorten test

Signed-off-by: Evelina <[email protected]>

* fix jenkins

Signed-off-by: Evelina <[email protected]>

* cased for TN

Signed-off-by: Evelina <[email protected]>

* revert debug changes

Signed-off-by: Evelina <[email protected]>

* fix args default

Signed-off-by: Evelina <[email protected]>

* try parallel

Signed-off-by: Evelina <[email protected]>

* debug parallel

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* fix sh tests for local SH launcher

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian TN (#137)

* merged with main branch and fixed conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing some more conflicts

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* fixed a minor issue

Signed-off-by: David Sargsyan <[email protected]>

* deleted unused imports

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix: add "hy" language option for armenian

Signed-off-by: Ara Yeroyan <[email protected]>

* added optional space for measurements after cardinals/decimals

Signed-off-by: David Sargsyan <[email protected]>

* added Armenian dot

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Signed-off-by: Ara Yeroyan <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ara Yeroyan <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Marathi ITN (#134)

* Added Marathi ITN

Signed-off-by: Chinmay Patil <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding jenkins test

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Chinmay Patil <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Travis Bartley <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins fix (#150)

* jenkins fix

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* missing _init_ for python

Signed-off-by: Travis Bartley <[email protected]>

* mislabeled cache

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* r0.3.0 release (#151)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Fix text=line[text] to text=line[text_field] (#153)

Signed-off-by: Sasha Meister <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* use real string on docstring (#157)

Signed-off-by: Kevin Sanders <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh postprocess (#147)

* Add support for postprocessor far in sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Choose between having a post processor or not

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* update run_evaluate script for cased itn (#164)

* update run_evaluate script for cased itn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* remove unused function from ar tn decimals (#165)

* remove unused function from ar tn decimals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* ZH sentence-level TN (#112)

* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path in for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuation

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arrangements for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus processing of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. Removing am and pm and allowing simple Mandarin expressions

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuation

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removing am and pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in a data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to the document on the national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated …
BuyuanCui added a commit that referenced this pull request Oct 16, 2024
* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranges for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from the last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. removing am and pm and allowing simple Mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Oct 16, 2024
* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path in for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuations

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranged for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates accordingly to the comments from last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated accordingly to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to document on national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug @xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated cases for SH tests

Signed-off-by: Alex Cui <[email protected]>

* updated cases

Signed-off-by: Alex Cui <[email protected]>

* added some sentences

Signed-off-by: Alex Cui <[email protected]>

* test cases update

Signed-off-by: Alex Cui <[email protected]>

* solving rebase issue, repushing changes

Signed-off-by: Alex Cui <[email protected]>

* resolving conflict

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixings according to ci

Signed-off-by: Alex Cui <[email protected]>

* fixings according to the ci

Signed-off-by: Alex Cui <[email protected]>

* removed not used

Signed-off-by: Alex Cui <[email protected]>

* notused removing

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* format issue

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* removing unused files

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* added sentences as test cases

Signed-off-by: Alex Cui <[email protected]>

* removed commented out tests

Signed-off-by: Alex Cui <[email protected]>

* updating dates

Signed-off-by: Alex Cui <[email protected]>

* attempts to fix bug

Signed-off-by: Alex Cui <[email protected]>

* in process of fixing the bug

Signed-off-by: Alex Cui <[email protected]>

* fixing existing issue

Signed-off-by: Alex Cui <[email protected]>

* updated graph_utils, tokenize and classify, and word graphs

Signed-off-by: Alex Cui <[email protected]>

* added back the postprocessor far creation

Signed-off-by: Alex Cui <[email protected]>

* updated NEMO_NOT_ALPHA as a new variable

Signed-off-by: Alex Cui <[email protected]>

* far files

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing and combined into measure

Signed-off-by: Alex Cui <[email protected]>

* removing, not used

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to fix space issue

Signed-off-by: Alex Cui <[email protected]>

* updates to solve the space issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh issue

Signed-off-by: Alex Cui <[email protected]>

* resolving sh test issue

Signed-off-by: Alex Cui <[email protected]>

* adding Anand's updates

Signed-off-by: Alex Cui <[email protected]>

* data updated for measure and whitelist

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* updates

Signed-off-by: Alex Cui <[email protected]>

* removing fraction and math part

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* removing preprocessor, updating measure, adding whitelist cases

Signed-off-by: Alex Cui <[email protected]>

* removing processor, modification for sp test, whitelist and word

Signed-off-by: Alex Cui <[email protected]>

* updating zh date

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* realized itn being commented out, adding back

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* trying to run zh tn separately because it takes long time to run

Signed-off-by: Alex Cui <[email protected]>

* modification to run zh tn separately

Signed-off-by: Alex Cui <[email protected]>

* independent zh tnitn tests for more time

Signed-off-by: Alex Cui <[email protected]>

* adding lines to save far file

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates for reducing testing time

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* for punct graph

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing unused graphs

Signed-off-by: Alex Cui <[email protected]>

* format and removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing this one, not used

Signed-off-by: Alex Cui <[email protected]>

* remove unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing unused comments

Signed-off-by: Alex Cui <[email protected]>

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* Delete tools/text_processing_deployment/zh directory

Removing far files.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* updates according to the github comments

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removing comments

Signed-off-by: Alex Cui <[email protected]>

* punct grammar

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_cases_cardinal.txt

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Dockerfile

Copied from main branch (which included Anand's updates)

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update launch.sh

Found differences in the file. Fixing it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Saw word ITN being commented out. Adding it back.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update money.py

Found cardinal grammar not accepting suffix. Fixed it.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update Jenkinsfile

Removed duplicated zh test from line 230s

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update utils.py

Addressing the bug raised in "bug in graph_utils.py of zh ITN and decimal tagger of ar TN" #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update graph_utils.py

Addressing bug in graph_utils.py of zh ITN and decimal tagger of ar TN #162.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Fixing code style, removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update measure.py

Removing unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update post_processing.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

Removing unused import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing unused imports

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cardinal.py

Deleting unused graph

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

Removing import pynini

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update word.py

removing pynini import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update verbalize.py

removing pynutil import

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update post_processing.py

removing punct graph imported

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_sparrowhawk_normalization.sh

Update on test issue for Docker file locations

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_ordinal.py

Fixing style.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/taggers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Delete nemo_text_processing/text_normalization/zh/verbalizers/math_symbol.py

Removing because it's not one of the semiotic classes.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

Updating Jenkins date

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Vitaly Lavrukhin <[email protected]>
Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Evelina <[email protected]>
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Enno Hermann <[email protected]>
Co-authored-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Enas Albasiri <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Co-authored-by: Mariana <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: lleaver <[email protected]>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Jim O’Regan <[email protected]>
Co-authored-by: Giacomo Leone Maria Cavallini <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Peter Plantinga <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>
BuyuanCui added a commit that referenced this pull request Oct 16, 2024
* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix broken path for nondet whitelist (#124)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Increase weights for serial (en TN) (#128)

* Increase weights for serial (en TN)

Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126

Signed-off-by: anand-nv <[email protected]>

* Add tests for fix

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile cache path

Signed-off-by: anand-nv <[email protected]>

* Update Jenkinsfile. Fix cache folder

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add measures file for FR TN (#131)

* add measures file

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update whitelist data

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh jenkins (#127)

* Add SH tests to Jenkins

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkins tests

Signed-off-by: Anand Joseph <[email protected]>

* Add CI/CD tests for sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* docker build only if in test mode

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing variable

Signed-off-by: Anand Joseph <[email protected]>

* Fix comments and remove arguments not required

Signed-off-by: Anand Joseph <[email protected]>

* Fix commands not executing

Signed-off-by: Anand Joseph <[email protected]>

* Missing arguments

Signed-off-by: Anand Joseph <[email protected]>

* Missing quotes

Signed-off-by: Anand Joseph <[email protected]>

* Fix incorrect path for tests

Signed-off-by: Anand Joseph <[email protected]>

* Fix paths

Signed-off-by: Anand Joseph <[email protected]>

* Incorrect paths of tests and shunit2

Signed-off-by: Anand Joseph <[email protected]>

* Fix issues with paths as arguments to shunit

Signed-off-by: Anand Joseph <[email protected]>

* Undo path change

Signed-off-by: Anand Joseph <[email protected]>

* Fix intentional fail test

Signed-off-by: Anand Joseph <[email protected]>

* revert redundant check for cased option

Signed-off-by: Anand Joseph <[email protected]>

* Fix default path in export_grammars.sh

Signed-off-by: Anand Joseph <[email protected]>

* Update cache paths

Signed-off-by: Anand Joseph <[email protected]>

* Add interactive option

Signed-off-by: Anand Joseph <[email protected]>

* Add SH tests for cased EN ITN

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* update isort - fix precommit (#138)

* update isort version

Signed-off-by: Evelina <[email protected]>

* update isort version

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused imports

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian itn (#136)

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* Added Armenian ITN

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* Added context for tests and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* Revert "Added context for tests and fixed CodeQL errors"

This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b.

Signed-off-by: David Sargsyan <[email protected]>

* Added context to some test files and fixed CodeQL errors

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* deleted unnecessary data

Signed-off-by: David Sargsyan <[email protected]>

* translated a few measurements to Armenian

Signed-off-by: David Sargsyan <[email protected]>

* adjusted some things for better readability and maintainer support

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed one test case and some issues

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Fix CI (#142)

* fix whitelist deployment

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* comment out tests to recreate grammars

Signed-off-by: Evelina <[email protected]>

* shorten test

Signed-off-by: Evelina <[email protected]>

* fix jenkins

Signed-off-by: Evelina <[email protected]>

* cased for TN

Signed-off-by: Evelina <[email protected]>

* revert debug changes

Signed-off-by: Evelina <[email protected]>

* fix args default

Signed-off-by: Evelina <[email protected]>

* try parallel

Signed-off-by: Evelina <[email protected]>

* debug parallel

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* rerun

Signed-off-by: Evelina <[email protected]>

* fix sh tests for local SH launcher

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

* enable all ci tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Armenian TN (#137)

* merged with main branch and fixed conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing conflicts

Signed-off-by: David Sargsyan <[email protected]>

* fixing some more conflicts

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: David Sargsyan <[email protected]>

* fixed a minor issue

Signed-off-by: David Sargsyan <[email protected]>

* deleted unused imports

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix: add "hy" language option for armenian

Signed-off-by: Ara Yeroyan <[email protected]>

* added optional space for measurements after cardinals/decimals

Signed-off-by: David Sargsyan <[email protected]>

* added Armenian dot

Signed-off-by: David Sargsyan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: David Sargsyan <[email protected]>
Signed-off-by: Ara Yeroyan <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Co-authored-by: David Sargsyan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ara Yeroyan <[email protected]>
Co-authored-by: tbartley94 <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Marathi ITN (#134)

* Added Marathi ITN

Signed-off-by: Chinmay Patil <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adding jenkins test

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Chinmay Patil <[email protected]>
Signed-off-by: tbartley94 <[email protected]>
Signed-off-by: Travis Bartley <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tbartley94 <[email protected]>
Co-authored-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins fix (#150)

* jenkins fix

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* removing armenian to troubleshoot jenkins

Signed-off-by: Travis Bartley <[email protected]>

* missing _init_ for python

Signed-off-by: Travis Bartley <[email protected]>

* mislabeled cache

Signed-off-by: Travis Bartley <[email protected]>

---------

Signed-off-by: Travis Bartley <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* r0.3.0 release (#151)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Fix text=line[text] to text=line[text_field] (#153)

Signed-off-by: Sasha Meister <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* use real string on docstring (#157)

Signed-off-by: Kevin Sanders <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Sh postprocess (#147)

* Add support for postprocessor far in sparrowhawk

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Choose between having a post processor or not

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* update run_evaluate script for cased itn (#164)

* update run_evaluate script for cased itn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* remove unused function from ar tn decimals (#165)

* remove unused function from ar tn decimals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* ZH sentence-level TN (#112)

* Swedish telephone fix (#60)

* port fix for telephone from swedish-itn branch

Signed-off-by: Jim O'Regan <[email protected]>

* extend cardinal in non-deterministic mode

Signed-off-by: Jim O'Regan <[email protected]>

* whitespace fixes

Signed-off-by: Jim O'Regan <[email protected]>

* also fix in the verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* Update Jenkinsfile

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* log instead of print in graph_utils.py (#68)

Signed-off-by: Enno Hermann <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* CER estimation speedup for audio-based text normalization (#73)

* Replaced jiwer with editdistance to speed up CER estimation

Signed-off-by: Vitaly Lavrukhin <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Vitaly Lavrukhin <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add measure coverage for TN and ITN (#62)

* add measure coverage for TN and ITN

Signed-off-by: ealbasiri <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* Remove unused imports

Signed-off-by: anand-nv <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update measure.py

Signed-off-by: anand-nv <[email protected]>

---------

Signed-off-by: ealbasiri <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63)

* upload es-ES and fr-FR g2p dicts

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add inits

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add NALA Spanish dict

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* rename Spanish and French dictionaries

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add Italian dictionary

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* add country codes from hu (#77)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix electronic case for username (#75)

* fix electronic username w/o .

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* fix ar test

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable sv tests

Signed-off-by: Evelina <[email protected]>

* update ci dirs, enable sv tests

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* 0.1.8 release (#79)

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Codeswitched ES/EN ITN  (#78)

* Initial commit for ES-EN codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Enable export for es_en codeswitched ITN

Signed-off-by: Anand Joseph <[email protected]>

* Add whitelist, update weights

Signed-off-by: Anand Joseph <[email protected]>

* Add tests for en_es, zone tagged separately in es

Signed-off-by: Anand Joseph <[email protected]>

* Fix path to test data for sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update Jenkinsfile - enable ES/EN tests

Signed-off-by: Anand Joseph <[email protected]>

* Add __init__.py files

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix issues with failed docker build - due to archiving of debian and issues with re2

Signed-off-by: Anand Joseph <[email protected]>

* Remove unused imports and variables

Signed-off-by: Anand Joseph <[email protected]>

* Update date

Signed-off-by: Anand Joseph <[email protected]>

* Enable NBSP in sparrowhawk tests

Signed-off-by: Anand Joseph <[email protected]>

* Update copyrights

Signed-off-by: Anand Joseph <[email protected]>

* Update cache path for ES/EN CI/CD

Signed-off-by: Anand Joseph <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* minor normalize.py edit for usability (#84)

* electronic verbalizer fallback (#81)

* 0.1.8 release

Signed-off-by: Evelina <[email protected]>

* add elec fallback

Signed-off-by: Evelina <[email protected]>

* update ci

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <[email protected]>

* documentation edits for grammar/clarity

Signed-off-by: Linnea Pari Leaver <[email protected]>

* added --output_field flag for command line interface

Signed-off-by: Linnea Pari Leaver <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Linnea Pari Leaver <[email protected]>
Co-authored-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Swedish ITN (#40)

* force two digits for month

Signed-off-by: Jim O'Regan <[email protected]>

* put it in a function, because I reject the garbage pre-commit.ci came up with

Signed-off-by: Jim O'Regan <[email protected]>

* wrap some more pieces

Signed-off-by: Jim O'Regan <[email protected]>

* add graph pieces

Signed-off-by: Jim O'Regan <[email protected]>

* delete junk

Signed-off-by: Jim O'Regan <[email protected]>

* my copyright

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser (copy from es)

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks

Signed-off-by: Jim O'Regan <[email protected]>

* add date verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* add right tokens

Signed-off-by: Jim O'Regan <[email protected]>

* some tweaks, more needed

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to ITN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* tweaks to TN date tagger

Signed-off-by: Jim O'Regan <[email protected]>

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* moved to tagger

Signed-off-by: Jim O'Regan <[email protected]>

* nothing actually fixed here

Signed-off-by: Jim O'Regan <[email protected]>

* now most tests pass

Signed-off-by: Jim O'Regan <[email protected]>

* electronic

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fractions

Signed-off-by: Jim O'Regan <[email protected]>

* extend

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* bare fractions is a bit of an overreach

Signed-off-by: Jim O'Regan <[email protected]>

* whitelist

Signed-off-by: Jim O'Regan <[email protected]>

* just inverting the TN whitelist tagger will not work/be useful

Signed-off-by: Jim O'Regan <[email protected]>

* copy from English

Signed-off-by: Jim O'Regan <[email protected]>

* overwrite with version from en

Signed-off-by: Jim O'Regan <[email protected]>

* add basic test case

Signed-off-by: Jim O'Regan <[email protected]>

* fix call

Signed-off-by: Jim O'Regan <[email protected]>

* swap tsv sides

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* add optional_era variable

Signed-off-by: Jim O'Regan <[email protected]>

* add test case

Signed-off-by: Jim O'Regan <[email protected]>

* make deterministic default, like most of the others

Signed-off-by: Jim O'Regan <[email protected]>

* also add lowercase versions

Signed-off-by: Jim O'Regan <[email protected]>

* replacing NEMO_SPACE does not work either

Signed-off-by: Jim O'Regan <[email protected]>

* increasing weight... did not work last time

Signed-off-by: Jim O'Regan <[email protected]>

* tweaking test cases, in case it was a sentence splitting issue. It was not

Signed-off-by: Jim O'Regan <[email protected]>

* put the full stops back

Signed-off-by: Jim O'Regan <[email protected]>

* add filler words

Signed-off-by: Jim O'Regan <[email protected]>

* try splitting this out to see if it makes a difference

Signed-off-by: Jim O'Regan <[email protected]>

* aha, this part should be non-deterministic only

Signed-off-by: Jim O'Regan <[email protected]>

* single line only

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "increasing weight... did not work last time"

This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996.

Signed-off-by: Jim O'Regan <[email protected]>

* disabling ITN here makes TN work again(?)

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "disabling ITN here makes TN work again(?)"

This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f.

Signed-off-by: Jim O'Regan <[email protected]>

* changing the variable name fixes norm tests

Signed-off-by: Jim O'Regan <[email protected]>

* change the variable names

Signed-off-by: Jim O'Regan <[email protected]>

* add missing test tooling

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* copy telephone fixes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* add a piece for area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add country codes from hu

Signed-off-by: Jim O'Regan <[email protected]>

* extend any_read_digit for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* country/area codes for ITN

Signed-off-by: Jim O'Regan <[email protected]>

* first attempt

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* add to t&c

Signed-off-by: Jim O'Regan <[email protected]>

* remove country codes for the time being, makes things ambiguous

Signed-off-by: Jim O'Regan <[email protected]>

* basic test cases

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove trailing whitespace

Signed-off-by: Jim O'Regan <[email protected]>

* Update __init__.py

Signed-off-by: Jim O’Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* basic transform of TN tests

Signed-off-by: Jim O'Regan <[email protected]>

* basic transformation of TN decimal tests

Signed-off-by: Jim O'Regan <[email protected]>

* slight changes to date

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* include space

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen

Signed-off-by: Jim O'Regan <[email protected]>

* problem with tusen was not that

Signed-off-by: Jim O'Regan <[email protected]>

* add functions from hu

Signed-off-by: Jim O'Regan <[email protected]>

* respect my own copyright xD

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading, this has been an oddity before

Signed-off-by: Jim O'Regan <[email protected]>

* try changing this year declaration

Signed-off-by: Jim O'Regan <[email protected]>

* add year + era

Signed-off-by: Jim O'Regan <[email protected]>

* eliminate more module-level data loading

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "eliminate more module-level data loading"

This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a.

Signed-off-by: Jim O'Regan <[email protected]>

* expose variables

Signed-off-by: Jim O'Regan <[email protected]>

* extra param for itn mode

Signed-off-by: Jim O'Regan <[email protected]>

* change call

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* change comment

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* fix parens

Signed-off-by: Jim O'Regan <[email protected]>

* move data loading

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* adapt comments

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adapt/extend tests

Signed-off-by: Jim O'Regan <[email protected]>

* fix dict init/change keys to something useful

Signed-off-by: Jim O'Regan <[email protected]>

* initial stab at prefixed numbers

Signed-off-by: Jim O'Regan <[email protected]>

* some adapting

Signed-off-by: Jim O'Regan <[email protected]>

* insert kl. if absent

Signed-off-by: Jim O'Regan <[email protected]>

* fix comments

Signed-off-by: Jim O'Regan <[email protected]>

* the relative prefixed times

Signed-off-by: Jim O'Regan <[email protected]>

* + comments

Signed-off-by: Jim O'Regan <[email protected]>

* enable time

Signed-off-by: Jim O'Regan <[email protected]>

* space in both directions

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix hours to

Signed-off-by: Jim O'Regan <[email protected]>

* split by before/after

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* fix if

Signed-off-by: Jim O'Regan <[email protected]>

* kl. 9

Signed-off-by: Jim O'Regan <[email protected]>

* copy from en

Signed-off-by: Jim O'Regan <[email protected]>

* keep only get_abs_path

Signed-off-by: Jim O'Regan <[email protected]>

* imports

Signed-off-by: Jim O'Regan <[email protected]>

* add trimmed file

Signed-off-by: Jim O'Regan <[email protected]>

* fix imports

Signed-off-by: Jim O'Regan <[email protected]>

* two abs_paths... could be fun

Signed-off-by: Jim O'Regan <[email protected]>

* minutes/seconds

Signed-off-by: Jim O'Regan <[email protected]>

* suffix

Signed-off-by: Jim O'Regan <[email protected]>

* delete, not insert

Signed-off-by: Jim O'Regan <[email protected]>

* one optional

Signed-off-by: Jim O'Regan <[email protected]>

* export variable

Signed-off-by: Jim O'Regan <[email protected]>

* kl. or one of suffix/zone

Signed-off-by: Jim O'Regan <[email protected]>

* already disambiguated

Signed-off-by: Jim O'Regan <[email protected]>

* closure

Signed-off-by: Jim O'Regan <[email protected]>

* do not insert kl.

Signed-off-by: Jim O'Regan <[email protected]>

* fix test case

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix spelling

Signed-off-by: Jim O'Regan <[email protected]>

* Delete measure.py

Signed-off-by: Jim O’Regan <[email protected]>

* Delete money.py

Signed-off-by: Jim O’Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused pieces

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused test pieces

Signed-off-by: Jim O'Regan <[email protected]>

* copy from es

Signed-off-by: Jim O'Regan <[email protected]>

* add SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* add/update __init__

Signed-off-by: Jim O'Regan <[email protected]>

* blank line

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* fix lang

Signed-off-by: Jim O'Regan <[email protected]>

* fix decimal verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix

Signed-off-by: Jim O'Regan <[email protected]>

* remove year, conflicts with cardinal

Signed-off-by: Jim O'Regan <[email protected]>

* space before, not after

Signed-off-by: Jim O'Regan <[email protected]>

* fix cardinal tests

Signed-off-by: Jim O'Regan <[email protected]>

* spurious deletion

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* re-enable SV TN; enable SV ITN

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "re-enable SV TN; enable SV ITN"

This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b.

Signed-off-by: Jim O'Regan <[email protected]>

* fix singulars

Signed-off-by: Jim O'Regan <[email protected]>

* add an export

Signed-off-by: Jim O'Regan <[email protected]>

* change integer graph

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move spaces

Signed-off-by: Jim O'Regan <[email protected]>

* use cdrewrite

Signed-off-by: Jim O'Regan <[email protected]>

* just EOS/BOS

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jim O'Regan <[email protected]>

* omit en/ett, because they are also articles

Signed-off-by: Jim O'Regan <[email protected]>

* uncomment

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* unused

Signed-off-by: Jim O'Regan <[email protected]>

* strip spaces from decimal part

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* partial fix, not what I wanted

Signed-off-by: Jim O'Regan <[email protected]>

* move comment

Signed-off-by: Jim O'Regan <[email protected]>

* en/ett cannot work in itn case

Signed-off-by: Jim O'Regan <[email protected]>

* be more deliberate in graph construction

Signed-off-by: Jim O'Regan <[email protected]>

* accept both

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* +2 tests

Signed-off-by: Jim O'Regan <[email protected]>

* (try to) accept singular quantities for plurals

Signed-off-by: Jim O'Regan <[email protected]>

* retry

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* oops

Signed-off-by: Jim O'Regan <[email protected]>

* replace

Signed-off-by: Jim O'Regan <[email protected]>

* arcmap

Signed-off-by: Jim O'Regan <[email protected]>

* version without ones

Signed-off-by: Jim O'Regan <[email protected]>

* add another test

Signed-off-by: Jim O'Regan <[email protected]>

* change graph

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of this, this is where it goes wrong

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* add a test

Signed-off-by: Jim O'Regan <[email protected]>

* multiple states from both ones, try removing and readding

Signed-off-by: Jim O'Regan <[email protected]>

* remove ones, see if that fixes at least the bare quantities

Signed-off-by: Jim O'Regan <[email protected]>

* works in the repl, dunno why it still breaks

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove duplicate

Signed-off-by: Jim O'Regan <[email protected]>

* move definition

Signed-off-by: Jim O'Regan <[email protected]>

* simplify

Signed-off-by: Jim O'Regan <[email protected]>

* tweak

Signed-off-by: Jim O'Regan <[email protected]>

* another test

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* local declaration, seems to not be working

Signed-off-by: Jim O'Regan <[email protected]>

* more tests

Signed-off-by: Jim O'Regan <[email protected]>

* match verbaliser

Signed-off-by: Jim O'Regan <[email protected]>

* fix last two failing tests

Signed-off-by: Jim O'Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing tests for telephone and word

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused variable

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused imports

Signed-off-by: Jim O'Regan <[email protected]>

* fix comment

Signed-off-by: Jim O'Regan <[email protected]>

* get rid of convert_space, tests fail

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails

Signed-off-by: Jim O'Regan <[email protected]>

* Revert "put convert_spaces back, change test file; pytest fails"

This reverts commit a7bb7489137b8026aab02aff64df39e874630043.

Signed-off-by: Jim O'Regan <[email protected]>

* put convert_spaces back, change test file; pytest fails, take 2

Signed-off-by: Jim O'Regan <[email protected]>

* deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk

Signed-off-by: Jim O'Regan <[email protected]>

* try converting the non-breaking spaces in the shell script

Signed-off-by: Jim O'Regan <[email protected]>

* wrong place

Signed-off-by: Jim O'Regan <[email protected]>

* fix typo

Signed-off-by: Jim O'Regan <[email protected]>

* fix path

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* export

Signed-off-by: Jim O'Regan <[email protected]>

* remove unused

Signed-off-by: Jim O'Regan <[email protected]>

* Update date.py

Signed-off-by: Jim O’Regan <[email protected]>

* Update time.py

Signed-off-by: Jim O’Regan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix comment

Signed-off-by: Jim O’Regan <[email protected]>

* trim comments

Signed-off-by: Jim O’Regan <[email protected]>

* remove commented line

Signed-off-by: Jim O’Regan <[email protected]>

* en halv

Signed-off-by: Jim O’Regan <[email protected]>

* Update test_sparrowhawk_inverse_text_normalization.sh

Signed-off-by: Jim O’Regan <[email protected]>

---------

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Jim O’Regan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Italian_TN (#67)

* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh itn (#74)

* Add ZH ITN

Signed-off-by: Anand Joseph <[email protected]>

* Fix copyrights and code cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Remove invalid tests

Signed-off-by: Anand Joseph <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolve CodeQL issues

Signed-off-by: Anand Joseph <[email protected]>

* Cleanup

Signed-off-by: Anand Joseph <[email protected]>

* Fix missing 'zh' option for ITN and correct comment

Signed-off-by: Anand Joseph <[email protected]>

* Update __init__.py

Change to zh instead of en for the imports.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for decimal test data

Signed-off-by: BuyuanCui <[email protected]>

* update for language import

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update for Chinese punctuation

Signed-off-by: BuyuanCui <[email protected]>

* a new class for whitelist

Signed-off-by: BuyuanCui <[email protected]>

* PYNINI_AVAILABLE = False

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to file import format issue

Signed-off-by: BuyuanCui <[email protected]>

* recreated due to format issue

Signed-off-by: BuyuanCui <[email protected]>

* caught duplicates, removed

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates, arranges for Chinese Yuan updates

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the comments from the last PR. Recreated some of the files due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them due to format issues

Signed-off-by: BuyuanCui <[email protected]>

* re-added this file to avoid data file import error

Signed-off-by: BuyuanCui <[email protected]>

* updated grammar according to last PR. Removed the acceptance of 千

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. Removed comma after decimal points

Signed-off-by: BuyuanCui <[email protected]>

* grammar for Fraction

Signed-off-by: BuyuanCui <[email protected]>

* grammar for money and updated according to last PR. Plus process of 元

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar. updates due to the updates in cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR comments. removing am and pm and allowing simple Mandarin expression

Signed-off-by: BuyuanCui <[email protected]>

* arrangements

Signed-off-by: BuyuanCui <[email protected]>

* added whitelist grammar

Signed-off-by: BuyuanCui <[email protected]>

* word grammar for non-classified items

Signed-off-by: BuyuanCui <[email protected]>

* updated cardinal, decimal, time, itn data

Signed-off-by: BuyuanCui <[email protected]>

* updates according to last PR

Signed-off-by: BuyuanCui <[email protected]>

* updates according to the updates for cardinal grammar

Signed-off-by: BuyuanCui <[email protected]>

* updates for more Mandarin punctuations

Signed-off-by: BuyuanCui <[email protected]>

* updated according to last PR. removing am pm

Signed-off-by: BuyuanCui <[email protected]>

* adjustment on the weight

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the tagger updates

Signed-off-by: BuyuanCui <[email protected]>

* updated according to the time tagger

Signed-off-by: BuyuanCui <[email protected]>

* updates according to changes in tagger on am and pm

Signed-off-by: BuyuanCui <[email protected]>

* verbalizer for fraction

Signed-off-by: BuyuanCui <[email protected]>

* added for mandarin grammar

Signed-off-by: BuyuanCui <[email protected]>

* kept this file because using English utils results in data naming error

Signed-off-by: BuyuanCui <[email protected]>

* merge conflict

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused import os

Signed-off-by: BuyuanCui <[email protected]>

* deleted unused variables

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* updates and edits based on pr checks

Signed-off-by: BuyuanCui <[email protected]>

* format issue, recreated

Signed-off-by: BuyuanCui <[email protected]>

* format issue recreated

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed coding style/format

Signed-off-by: BuyuanCui <[email protected]>

* fixed coding style and format

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicated graph for 毛

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removed the comment

Signed-off-by: BuyuanCui <[email protected]>

* removing unnecessary comments

Signed-off-by: BuyuanCui <[email protected]>

* unnecessary comment removed

Signed-off-by: BuyuanCui <[email protected]>

* test file updated for more cases

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated with a comment explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* updated the file explaining why this file is kept

Signed-off-by: BuyuanCui <[email protected]>

* added Mandarin as zh

Signed-off-by: BuyuanCui <[email protected]>

* removing for duplication

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* removed unused NEMO objects

Signed-off-by: BuyuanCui <[email protected]>

* removed duplicates

Signed-off-by: BuyuanCui <[email protected]>

* removing unused imports

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix test file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to fix file failures

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to resolve test case failure

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* updates to adapt to cardinal grammar changes

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fix style

Signed-off-by: BuyuanCui <[email protected]>

* fixing pr checks

Signed-off-by: BuyuanCui <[email protected]>

* removed // for zhtn/itn cache

Signed-off-by: BuyuanCui <[email protected]>

* Update inverse_normalize.py

Added zh as a selection to pass Jenkins checks.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* updated pynini_export.py file to create far files (#88)

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* readd Swedish (#87)

Signed-off-by: Jim O'Regan <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn 0712 (#89)

* updates

Signed-off-by: BuyuanCui <[email protected]>

* updates and fixes according to the document on the national guideline

Signed-off-by: BuyuanCui <[email protected]>

* Decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* fraction updated

Signed-off-by: BuyuanCui <[email protected]>

* money updated

Signed-off-by: BuyuanCui <[email protected]>

* ordinal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* punctuation grammar added

Signed-off-by: BuyuanCui <[email protected]>

* time grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* tokenizer updated

Signed-off-by: BuyuanCui <[email protected]>

* updates on certificate

Signed-off-by: BuyuanCui <[email protected]>

* data updated and added due to updates and changes to the existing grammar

Signed-off-by: BuyuanCui <[email protected]>

* cardinal updated

Signed-off-by: BuyuanCui <[email protected]>

* date grammar changed

Signed-off-by: BuyuanCui <[email protected]>

* decimal grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar updated

Signed-off-by: BuyuanCui <[email protected]>

* grammar added

Signed-off-by: BuyuanCui <[email protected]>

* grammar updates

Signed-off-by: BuyuanCui <[email protected]>

* test data added

Signed-off-by: BuyuanCui <[email protected]>

* test python file edits

Signed-off-by: BuyuanCui <[email protected]>

* updates for tn1.0 and previous tn grammar from contribution

Signed-off-by: BuyuanCui <[email protected]>

* test cases updated

Signed-off-by: BuyuanCui <[email protected]>

* coding style fixed

Signed-off-by: BuyuanCui <[email protected]>

* dates updated for init files

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated the date for zh

Signed-off-by: BuyuanCui <[email protected]>

* removed unused imports

Signed-off-by: BuyuanCui <[email protected]>

* removed comments

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added back the itn tests

Signed-off-by: BuyuanCui <[email protected]>

* added back measure and math from previous TN

Signed-off-by: BuyuanCui <[email protected]>

* updated for tests reruns

Signed-off-by: BuyuanCui <[email protected]>

* updates

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* updated weights

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Zh tn char (#95)

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name change

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* file name

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* code style

Signed-off-by: BuyuanCui <[email protected]>

* fixed import error

Signed-off-by: BuyuanCui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* audio-based TN fix for empty pred_text/text (#92)

* fix for empty pred_text

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add unittests

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix path

Signed-off-by: Evelina <[email protected]>

* fix path

Signed-off-by: Evelina <[email protected]>

* fix pytest

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* pip 1.2.0

Signed-off-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* French tn (#91)

* add tests for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add fr tn for cardinals, decimals, fractions and ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete it far files from tools

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add languages to run_evaluate

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* remove ambiguous spacing

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* enable sh testing for fr tn

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile cache date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix test for ordinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update tn cache for fr

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* resolve codeql issues

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Add whitelist_tech.tsv (#96)

Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Zhitn 0727 (#93)

* updates on itn grammar to pass sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* updates for sparrowhawk tests

Signed-off-by: BuyuanCui <[email protected]>

* coding style fix

Signed-off-by: BuyuanCui <[email protected]>

* updates for coding style and sparrowhawk test

Signed-off-by: BuyuanCui <[email protected]>

* updated classes for tests on whitelist and word grammar

Signed-off-by: BuyuanCui <[email protected]>

* added for tests on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added for test on word

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on whitelist

Signed-off-by: BuyuanCui <[email protected]>

* added to run test on word

Signed-off-by: BuyuanCui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_word.py

Removed unused import.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_word.py

Removed imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Removing imports according to CodeQL

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update test_whitelist.py

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

* Update Jenkinsfile

changed zh cache to 07-27-23 as it is the latest update.

Signed-off-by: Buyuan(Alex) Cui <[email protected]>

---------

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Es tn romans fix (#98)

* fix es tn roman exceptions

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update jenkinsfile

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update eval script for ITN

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Change docker image (#102)

Change docker image to one including sparrowhawk

Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Print warning instead of exception (#97)

* raise text

Signed-off-by: Nikolay Karpov <[email protected]>

* text arg

Signed-off-by: Nikolay Karpov <[email protected]>

* Failed text

Signed-off-by: Nikolay Karpov <[email protected]>

* add logger

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* logger

Signed-off-by: Nikolay Karpov <[email protected]>

* NeMo-text-processing

Signed-off-by: Nikolay Karpov <[email protected]>

* info level

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rm raise

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Normalizer.select_verbalizer

Signed-off-by: Nikolay Karpov <[email protected]>

* Exception

Signed-off-by: Nikolay Karpov <[email protected]>

* verbose

Signed-off-by: Nikolay Karpov <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restart ci

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Nikolay Karpov <[email protected]>
Co-authored-by: Evelina <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* warning regardless of verbose flag (#107)

* warning

Signed-off-by: Nikolay Karpov <[email protected]>

* self.verbose

Signed-off-by: Nikolay Karpov <[email protected]>

---------

Signed-off-by: Nikolay Karpov <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Unpin setuptools (#106)

Signed-off-by: Peter Plantinga <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fixed warnings: File is not always closed. (#113)

Signed-off-by: Xuesong Yang <[email protected]>
Co-authored-by: Xuesong Yang <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* fix bug #111 (ar currencies) (#117)

* fix bug #111 (ar currencies)

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci folder

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* Logging clean up + IT TN fix (#118)

* fix utils and it TN

Signed-off-by: Evelina <[email protected]>

* clean up

Signed-off-by: Evelina <[email protected]>

* fix logging

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix format

Signed-off-by: Evelina <[email protected]>

* fix format

Signed-off-by: Evelina <[email protected]>

* add IT TN to CI

Signed-off-by: Evelina <[email protected]>

* update patch

Signed-off-by: Evelina <[email protected]>

---------

Signed-off-by: Evelina <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* Time_IT_TN (#105)

* add time verbalizer

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add time tagger and verba

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* add pytest time

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* codeQL

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix numbers with eight

Signed-off-by: GiacomoLeoneMaria <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <[email protected]>

* IT TN improvement on tests (#120)

* add missing test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* fix bug with time tests

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* add sentence test cases

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* refine shortest path for irregular cardinals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci date

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* add single letter exception for roman numerals (#121)

* add single letter exception for roman numerals

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

* update ci dir

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>

---------

Signed-off-by: Mariana Graterol Fuenmayor <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* rewrote tokenizer

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* removed the file and replaced it with char in 1.8

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* jenkins file update

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* to fix tn bug@ xuesong

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* tn bug

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* fixes and updates

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Alex Cui <[email protected]>

* adjustments

Signed-off-by: BuyuanCui <[email protected]>
Signed-off-by: Alex Cui <[email protected]>

* testing commit

Signed-off-by: Alex Cui <[email protected]>

* removing unused file

Signed-off-by: Alex Cui <[email protected]>

* updated test cases

Signed-off-by: Alex Cui <[email protected]>

* updating test cases

Signed-off-by: Alex Cui <[email protected]>

* updates adapting to graphs

Signed-off-by: Alex Cui <[email protected]>

* updated …
4 participants