Profanity filtering for ITN - EN #86
Commits on Jul 3, 2023
-
* Add ZH ITN Signed-off-by: Anand Joseph <[email protected]>
* Fix copyrights and code cleanup Signed-off-by: Anand Joseph <[email protected]>
* Remove invalid tests Signed-off-by: Anand Joseph <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Resolve CodeQL issues Signed-off-by: Anand Joseph <[email protected]>
* Cleanup Signed-off-by: Anand Joseph <[email protected]>
* Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <[email protected]>
* Update __init__.py Change to zh instead of en for the imports. Signed-off-by: Buyuan(Alex) Cui <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update for decimal test data Signed-off-by: BuyuanCui <[email protected]>
* update for langauge import Signed-off-by: BuyuanCui <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update for Chinese punctuations Signed-off-by: BuyuanCui <[email protected]>
* a new class for whitelist Signed-off-by: BuyuanCui <[email protected]>
* PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <[email protected]>
* recreated due to file import format issue Signed-off-by: BuyuanCui <[email protected]>
* recreated due to format issue Signed-off-by: BuyuanCui <[email protected]>
* caught duplicates, removed Signed-off-by: BuyuanCui <[email protected]>
* removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <[email protected]>
* updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <[email protected]>
* removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <[email protected]>
* re-added this file to avoid data file import error Signed-off-by: BuyuanCui <[email protected]>
* updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <[email protected]>
* updates Signed-off-by: BuyuanCui <[email protected]>
* updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <[email protected]>
* gramamr for Fraction Signed-off-by: BuyuanCui <[email protected]>
* gramamr for money and updated according to last PR. Plus process of 元 Signed-off-by: BuyuanCui <[email protected]>
* ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <[email protected]>
* updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <[email protected]>
* arrangements Signed-off-by: BuyuanCui <[email protected]>
* added whitelist grammar Signed-off-by: BuyuanCui <[email protected]>
* word grammar for non-classified items Signed-off-by: BuyuanCui <[email protected]>
* updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <[email protected]>
* updates according to last PR Signed-off-by: BuyuanCui <[email protected]>
* updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <[email protected]>
* updates for more Mandarin punctuations Signed-off-by: BuyuanCui <[email protected]>
* updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <[email protected]>
* adjustment on the weight Signed-off-by: BuyuanCui <[email protected]>
* updated accordingly to the targger updates Signed-off-by: BuyuanCui <[email protected]>
* updated accordingly to the time tagger Signed-off-by: BuyuanCui <[email protected]>
* updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <[email protected]>
* verbalizer for fraction Signed-off-by: BuyuanCui <[email protected]>
* added for mandarin grammar Signed-off-by: BuyuanCui <[email protected]>
* kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <[email protected]>
* merge conflict Signed-off-by: BuyuanCui <[email protected]>
* removed unsed imports Signed-off-by: BuyuanCui <[email protected]>
* deleted unsed import os Signed-off-by: BuyuanCui <[email protected]>
* deleted unsed variables Signed-off-by: BuyuanCui <[email protected]>
* removed unsed imports Signed-off-by: BuyuanCui <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* updates and edits based on pr checks Signed-off-by: BuyuanCui <[email protected]>
* updates and edits based on pr checks Signed-off-by: BuyuanCui <[email protected]>
* format issue, reccreated Signed-off-by: BuyuanCui <[email protected]>
* format issue recreated Signed-off-by: BuyuanCui <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fixed codeing style/format Signed-off-by: BuyuanCui <[email protected]>
* fixed coding style and format Signed-off-by: BuyuanCui <[email protected]>
* removed duplicated graph for 毛 Signed-off-by: BuyuanCui <[email protected]>
* removed the comment Signed-off-by: BuyuanCui <[email protected]>
* removed the comment Signed-off-by: BuyuanCui <[email protected]>
* removing unnecessary comments Signed-off-by: BuyuanCui <[email protected]>
* unnecessary comment removed Signed-off-by: BuyuanCui <[email protected]>
* test file updated for more cases Signed-off-by: BuyuanCui <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <[email protected]>
* updated the file explaining why this file is kept Signed-off-by: BuyuanCui <[email protected]>
* added Mandarin as zh Signed-off-by: BuyuanCui <[email protected]>
* removing for dplication Signed-off-by: BuyuanCui <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* removed unused NEMO objects Signed-off-by: BuyuanCui <[email protected]>
* removed duplicates Signed-off-by: BuyuanCui <[email protected]>
* removing unsed imports Signed-off-by: BuyuanCui <[email protected]>
* updates to fix test file failures Signed-off-by: BuyuanCui <[email protected]>
* updates to fix file failtures Signed-off-by: BuyuanCui <[email protected]>
* updates to resolve test case failture Signed-off-by: BuyuanCui <[email protected]>
* updates to resolve test case failure Signed-off-by: BuyuanCui <[email protected]>
* updates to resolve test case failure Signed-off-by: BuyuanCui <[email protected]>
* updates to resolve test case failure Signed-off-by: BuyuanCui <[email protected]>
* updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <[email protected]>
* updates to adapt to grammar changes Signed-off-by: BuyuanCui <[email protected]>
* updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix style Signed-off-by: BuyuanCui <[email protected]>
* fix style Signed-off-by: BuyuanCui <[email protected]>
* fix style Signed-off-by: BuyuanCui <[email protected]>
* fix style Signed-off-by: BuyuanCui <[email protected]>
* fixing pr checks Signed-off-by: BuyuanCui <[email protected]>
* removed // for zhtn/itn cache Signed-off-by: BuyuanCui <[email protected]>
* Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. Signed-off-by: Buyuan(Alex) Cui <[email protected]>
---------
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: Buyuan(Alex) Cui <[email protected]>
Signed-off-by: BuyuanCui <[email protected]>
Co-authored-by: Alex Cui <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: gayu-thri <[email protected]>
SHA: a8078de
-
Add profanity filtering for english ITN
Signed-off-by: Gayathri Ethiraj <[email protected]> Signed-off-by: gayu-thri <[email protected]>
SHA: 29a6272
-
Signed-off-by: Gayathri Ethiraj <[email protected]> Signed-off-by: gayu-thri <[email protected]>
SHA: d65ff7d
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci Signed-off-by: gayu-thri <[email protected]>
SHA: d70a4ec
-
Add filter_profanity attr to InverseNormalizer
Signed-off-by: gayu-thri <[email protected]>
SHA: 252bb6d
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci Signed-off-by: gayu-thri <[email protected]>
SHA: 2e71ebb
-
Different fst names with/without pf
Signed-off-by: gayu-thri <[email protected]>
SHA: a8a7826
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci Signed-off-by: gayu-thri <[email protected]>
SHA: 0cfb3a8
-
Rm written form in TSV and use fst operations to get it
Signed-off-by: gayu-thri <[email protected]>
SHA: 62efdd6
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci Signed-off-by: gayu-thri <[email protected]>
SHA: 1d5a362
-
user configurable input file for profane words
Signed-off-by: gayu-thri <[email protected]>
SHA: f9e5bde
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci Signed-off-by: gayu-thri <[email protected]>
SHA: b0a6a98
-
Merge branch 'main' into add-profanity-filtering
Signed-off-by: Gayathri Ethiraj <[email protected]>
SHA: b3375c2
Commits on Jul 25, 2023
-
SHA: e6548dd
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
SHA: a46ea2d
Commits on Aug 7, 2023
-
SHA: bf4e9a1
-
disable filtering profanity by default
Signed-off-by: gayu-thri <[email protected]>
SHA: 3c79f42
-
Remove raising explicit ValueError when custom list is not passed
Signed-off-by: gayu-thri <[email protected]>
SHA: 1ce954c
-
Set filer_profanity to True in profane test
Signed-off-by: gayu-thri <[email protected]>
SHA: 9c710c4
-
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
SHA: 9880c05
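
Note on the behaviour described by the commit titles above: they mention a `filter_profanity` attribute that is disabled by default, a user-configurable input file of profane words, and deriving the written (masked) form rather than storing it in the TSV. The following is a minimal pure-Python sketch of that idea only. It is not the PR's implementation, which the titles say uses FST operations; the function names, the one-word-per-line file layout, and the "first letter plus asterisks" masking format are all assumptions made for illustration.

```python
import re
from pathlib import Path
from typing import Set


def load_profane_words(path: str) -> Set[str]:
    """Read a word list, one entry per line (assumed layout), lowercased."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {line.strip().lower() for line in lines if line.strip()}


def mask_word(word: str) -> str:
    """Derive the written form from the spoken form (assumed format: 'd***')."""
    return word[0] + "*" * (len(word) - 1)


def filter_text(text: str, profane_words: Set[str], filter_profanity: bool = False) -> str:
    """Mask listed words only when filtering is explicitly enabled."""
    if not filter_profanity:
        # Filtering is opt-in, mirroring "disable filtering profanity by default".
        return text

    def replace(match: "re.Match[str]") -> str:
        word = match.group(0)
        return mask_word(word) if word.lower() in profane_words else word

    # Match whole alphabetic tokens so substrings of longer words are left alone.
    return re.sub(r"[A-Za-z']+", replace, text)
```

With the flag left at its default the text passes through unchanged; masking happens only when the caller sets `filter_profanity=True` and supplies a word set, for example one returned by `load_profane_words`.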