Hindi ITN Support for Cardinal, Decimal, Ordinal, Fraction, Date, Time, Money and Measure #223

Merged: 30 commits, Oct 30, 2024
Commits
50fdc44
Hindi ITN Support for Cardinal, Decimal, Ordinal, Fraction, Date, Time
tarushi2k2 Aug 21, 2024
cc8f97c
Cleanup
tarushi2k2 Aug 21, 2024
a45a553
Cleanup
tarushi2k2 Aug 21, 2024
7e7f22f
Committing all changes made
tarushi2k2 Aug 21, 2024
4009673
Updated date.py and added more test cases to cardinal for improved ac…
tarushi2k2 Aug 26, 2024
07f9d1c
Updated date.py
tarushi2k2 Aug 26, 2024
502ada1
Added hi to Jenkins and cleanup
tarushi2k2 Aug 29, 2024
bc58fec
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 29, 2024
69f5720
Changes and cleanup based on feedback
tarushi2k2 Sep 2, 2024
6f5af97
Changes and cleanup based on feedback
tarushi2k2 Sep 2, 2024
2799c03
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 3, 2024
d311945
Resolved conflicts
tarushi2k2 Sep 4, 2024
ef12371
Committing code for measure.py
tarushi2k2 Sep 4, 2024
1b86549
Cleanup
tarushi2k2 Sep 4, 2024
73ca416
Cleanup
tarushi2k2 Sep 4, 2024
71f6b6c
changes to run_evaluate.py
tarushi2k2 Sep 25, 2024
2b18ce8
Hindi ITN for money.py
tarushi2k2 Sep 30, 2024
bb4e888
Changes and cleanup
tarushi2k2 Oct 14, 2024
e365feb
Cleanup
tarushi2k2 Oct 16, 2024
c457a93
Cleanup
tarushi2k2 Oct 17, 2024
a833c5a
Cleanup date verbalizer
tarushi2k2 Oct 18, 2024
72fd9c7
Cleanup
tarushi2k2 Oct 21, 2024
88ca49f
Cleanup
tarushi2k2 Oct 21, 2024
f80f86a
Cleanup
tarushi2k2 Oct 22, 2024
bf8a8d4
Cleanup
tarushi2k2 Oct 23, 2024
dcf33b3
Cleanup
tarushi2k2 Oct 23, 2024
21b0496
Cleanup
tarushi2k2 Oct 29, 2024
8922144
Cleanup
tarushi2k2 Oct 29, 2024
34e1087
pushing .gitignore file from main branch
tarushi2k2 Oct 30, 2024
fd7b5c5
Merge branch 'main' into hi_itn
tarushi2k2 Oct 30, 2024
25 changes: 24 additions & 1 deletion Jenkinsfile
@@ -27,6 +27,7 @@ pipeline {
HY_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/03-12-24-0'
MR_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/03-12-24-1'
JA_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/10-17-24-1'
HI_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/10-29-24-0'
DEFAULT_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-08-23-0'
}
stages {
@@ -92,6 +93,23 @@ pipeline {

}
}
stage('L0: Create HI TN/ITN Grammars') {
    when {
        anyOf {
            branch 'main'
            changeRequest target: 'main'
        }
    }
    failFast true
    parallel {
        stage('L0: Hi ITN grammars') {
            steps {
                sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/inverse_text_normalization/inverse_normalize.py --language hi --text="बीस" --cache_dir ${HI_TN_CACHE}'
            }
        }

    }
}

stage('L0: Create DE/ES TN/ITN Grammars') {
when {
@@ -323,6 +341,11 @@ pipeline {
sh 'CUDA_VISIBLE_DEVICES="" pytest tests/nemo_text_processing/es/ -m "not pleasefixme" --cpu --tn_cache_dir ${ES_TN_CACHE}'
}
}
stage('L1: Run all HI TN/ITN tests (restore grammars from cache)') {
    steps {
        sh 'CUDA_VISIBLE_DEVICES="" pytest tests/nemo_text_processing/hi/ -m "not pleasefixme" --cpu --tn_cache_dir ${HI_TN_CACHE}'
    }
}
stage('L1: Run all Codeswitched ES/EN TN/ITN tests (restore grammars from cache)') {
steps {
sh 'CUDA_VISIBLE_DEVICES="" pytest tests/nemo_text_processing/es_en/ -m "not pleasefixme" --cpu --tn_cache_dir ${ES_EN_TN_CACHE}'
@@ -476,4 +499,4 @@ pipeline {
cleanWs()
}
}
}
}
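Note: the new L0 stage is a smoke test that runs `inverse_normalize.py` once for Hindi so the grammars are built into the cache directory. A minimal sketch of reproducing that invocation locally follows. The helper name `build_itn_smoke_command` and the cache path `/tmp/hi_itn_cache` are assumptions made for illustration; only the script path and the `--language`, `--text` and `--cache_dir` flags come from the Jenkinsfile above.

```python
import os
from typing import Dict, List


def build_itn_smoke_command(language: str, text: str, cache_dir: str) -> List[str]:
    """Return the argument list for the ITN smoke test (hypothetical helper)."""
    return [
        "python",
        "nemo_text_processing/inverse_text_normalization/inverse_normalize.py",
        "--language",
        language,
        f"--text={text}",
        "--cache_dir",
        cache_dir,
    ]


def cpu_only_env() -> Dict[str, str]:
    """Copy the current environment and hide all GPUs, as the CI step does."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


# The cache path below is an assumption; CI uses ${HI_TN_CACHE} instead.
command = build_itn_smoke_command("hi", "बीस", "/tmp/hi_itn_cache")
env = cpu_only_env()
```

From a checkout of the repository root, passing `command` and `env` to `subprocess.run(command, env=env, check=True)` should mirror the CI step; the sketch stops short of executing it because that requires the package and its dependencies to be installed.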
17 changes: 17 additions & 0 deletions nemo_text_processing/inverse_text_normalization/hi/__init__.py
@@ -0,0 +1,17 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from nemo_text_processing.inverse_text_normalization.hi.taggers.tokenize_and_classify import ClassifyFst
from nemo_text_processing.inverse_text_normalization.hi.verbalizers.verbalize import VerbalizeFst
from nemo_text_processing.inverse_text_normalization.hi.verbalizers.verbalize_final import VerbalizeFinalFst
@@ -0,0 +1,13 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@@ -0,0 +1,13 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@@ -0,0 +1,34 @@
१ एक
२ दो
३ तीन
४ चार
५ पाँच
६ छः
६ छ:
६ छह
६ छे
७ सात
८ आठ
९ नौ
१० दस
११ ग्यारह
१२ बारह
१३ तेरह
१४ चौदह
१५ पन्द्रह
१६ सोलह
१७ सत्रह
१८ अठारह
१९ उन्नीस
२० बीस
२१ इक्कीस
२२ बाईस
२३ तेईस
२४ चौबीस
२५ पच्चीस
२६ छब्बीस
२७ सत्ताईस
२८ अट्ठाईस
२९ उनतीस
३० तीस
३१ इकतीस
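Note: the table above pairs a written Devanagari number with its spoken form, and several spoken variants (for example the four spellings of six) share one written form. The grammars in this PR presumably compile such a file into an FST; the plain-Python sketch below only illustrates the resulting spoken-to-written lookup. The names `ROWS` and `build_spoken_to_written` are hypothetical, and the rows are a small excerpt of the table, not the whole file.

```python
from typing import Dict

# Excerpt of the two-column table above: written form, then spoken form.
ROWS = """\
१ एक
६ छः
६ छह
६ छे
२० बीस
३१ इकतीस
"""


def build_spoken_to_written(rows: str) -> Dict[str, str]:
    """Map each spoken form to its written form; variants share one value."""
    table: Dict[str, str] = {}
    for line in rows.splitlines():
        if not line.strip():
            continue  # skip blank lines
        written, spoken = line.split(maxsplit=1)
        table[spoken.strip()] = written
    return table


SPOKEN_TO_WRITTEN = build_spoken_to_written(ROWS)
```

Keying the dictionary on the spoken form is what lets several spelling variants resolve to the same written number, which is the direction inverse text normalization needs.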
@@ -0,0 +1,14 @@
जनवरी
फ़रवरी
फरवरी
मार्च
अप्रैल
अप्रील
मई
जून
जुलाई
अगस्त
सितंबर
अक्टूबर
नवंबर
दिसंबर
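Note: the month list above contains 14 entries for 12 months because February and April each appear in two spellings. A minimal sketch of treating those spellings as equivalent follows. The grouping, the month numbers, and the names `MONTH_NAMES` and `month_number` are illustrative assumptions; the file itself only lists the accepted names.

```python
from typing import Dict, List, Optional

# Each inner list holds the accepted spellings of one month, in calendar order.
MONTH_NAMES: List[List[str]] = [
    ["जनवरी"],
    ["फ़रवरी", "फरवरी"],
    ["मार्च"],
    ["अप्रैल", "अप्रील"],
    ["मई"],
    ["जून"],
    ["जुलाई"],
    ["अगस्त"],
    ["सितंबर"],
    ["अक्टूबर"],
    ["नवंबर"],
    ["दिसंबर"],
]

# Flatten to a spelling -> month-number lookup (1 = January).
MONTH_NUMBER: Dict[str, int] = {
    name: number
    for number, variants in enumerate(MONTH_NAMES, start=1)
    for name in variants
}


def month_number(token: str) -> Optional[int]:
    """Return the month number for a known spelling, else None."""
    return MONTH_NUMBER.get(token.strip())
```

Listing both spellings explicitly, as the data file does, avoids having to guess at normalization rules for the nukta or vowel variants.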