Jp tn 20241017 (#240)
* ja tn

Signed-off-by: Alex Cui <[email protected]>

* adding ja

Signed-off-by: Alex Cui <[email protected]>

* removing

Signed-off-by: Alex Cui <[email protected]>

* updated tests

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* addressing comment

Signed-off-by: Alex Cui <[email protected]>

* addressing ci

Signed-off-by: Alex Cui <[email protected]>

* addressing ci

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* addresing comment

Signed-off-by: Alex Cui <[email protected]>

* removing

Signed-off-by: Alex Cui <[email protected]>

* adresing comment

Signed-off-by: Alex Cui <[email protected]>

* removing unused import

Signed-off-by: Alex Cui <[email protected]>

* addressing comment

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* addressing comment;

Signed-off-by: Alex Cui <[email protected]>

* addressing comment

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* date for ja

Signed-off-by: Alex Cui <[email protected]>

* addresing comment

Signed-off-by: Alex Cui <[email protected]>

* addressing comment

Signed-off-by: Alex Cui <[email protected]>

* jenkins

Signed-off-by: Alex Cui <[email protected]>

* addresing comment

Signed-off-by: Alex Cui <[email protected]>

* addressing comment

Signed-off-by: Alex Cui <[email protected]>

* typo

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* adressing comment

Signed-off-by: Alex Cui <[email protected]>

* addressing comment

Signed-off-by: Alex Cui <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ci

Signed-off-by: Alex Cui <[email protected]>

---------

Signed-off-by: Alex Cui <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
BuyuanCui and pre-commit-ci[bot] authored Oct 18, 2024
1 parent a66e16b commit a3fc6f5
Showing 61 changed files with 3,262 additions and 100 deletions.
2 changes: 1 addition & 1 deletion Jenkinsfile
@@ -26,7 +26,7 @@ pipeline {
IT_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/08-22-24-0'
HY_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/03-12-24-0'
MR_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/03-12-24-1'
- JA_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/09-27-24-0'
+ JA_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/10-17-24-1'
DEFAULT_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-08-23-0'
}
stages {
18 changes: 18 additions & 0 deletions nemo_text_processing/text_normalization/ja/__init__.py
@@ -0,0 +1,18 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


from nemo_text_processing.text_normalization.ja.taggers.tokenize_and_classify import ClassifyFst
from nemo_text_processing.text_normalization.ja.verbalizers.verbalize import VerbalizeFst
from nemo_text_processing.text_normalization.ja.verbalizers.verbalize_final import VerbalizeFinalFst
13 changes: 13 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
13 changes: 13 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/date/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
31 changes: 31 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/date/day.tsv
@@ -0,0 +1,31 @@
1 一
2 二
3 三
4 四
5 五
6 六
7 七
8 八
9 九
10 十
11 十一
12 十二
13 十三
14 十四
15 十五
16 十六
17 十七
18 十八
19 十九
20 二十
21 二十一
22 二十二
23 二十三
24 二十四
25 二十五
26 二十六
27 二十七
28 二十八
29 二十九
30 三十
31 三十一
12 changes: 12 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/date/era.tsv
@@ -0,0 +1,12 @@
令和
平成
昭和
大正
明治
西暦
和暦
西洋暦
グレゴリオ暦
紀元前
紀元
紀元後
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
R. 令和
H. 平成
S. 昭和
T. 大正
M. 明治
12 changes: 12 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/date/month.tsv
@@ -0,0 +1,12 @@
1 一
2 二
3 三
4 四
5 五
6 六
7 七
8 八
9 九
10 十
11 十一
12 十二
15 changes: 15 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/date/week.tsv
@@ -0,0 +1,15 @@
月曜日
火曜日
水曜日
木曜日
金曜日
土曜日
日曜日
祝日
月曜日
火曜日
水曜日
木曜日
金曜日
土曜日
日曜日
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
1
2
3
4
5
6
7
8
9
10 changes: 10 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/numbers/teen.tsv
@@ -0,0 +1,10 @@
10 十
11 十一
12 十二
13 十三
14 十四
15 十五
16 十六
17 十七
18 十八
19 十九
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
2 二十
3 三十
4 四十
5 五十
6 六十
7 七十
8 八十
9 九十
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
0
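
The number tables above (single digits, 10 to 19, and the tens) can be read as lookup tables that compose into any value from 1 to 99. The following is a minimal sketch of that composition, not the grammar from this commit: the actual code presumably compiles the .tsv files into FSTs, while this sketch uses plain dictionaries. The names `DIGIT`, `TY`, `TEEN` and `to_kanji` are hypothetical, and the single-digit readings (一 to 九) are assumed to be the standard kanji numerals.

```python
# Illustrative stand-ins for the digit, teen and tens tables.
# Assumption: plain dicts replace the FSTs that the real grammars build.
DIGIT = dict(zip("123456789", "一二三四五六七八九"))  # assumed single-digit readings
TY = {d: k + "十" for d, k in DIGIT.items() if d != "1"}  # 2 -> 二十 ... 9 -> 九十
TEEN = {"10": "十", **{"1" + d: "十" + k for d, k in DIGIT.items()}}  # 10 -> 十 ... 19 -> 十九


def to_kanji(number: str) -> str:
    """Spell out a decimal string between 1 and 99 in kanji numerals."""
    if number in DIGIT:
        return DIGIT[number]
    if number in TEEN:
        return TEEN[number]
    if len(number) == 2 and number[0] in TY and number[1] in "0123456789":
        tens, ones = number
        # A trailing "0" contributes nothing, so 20 becomes 二十.
        return TY[tens] + DIGIT.get(ones, "")
    raise ValueError(f"unsupported number: {number!r}")


if __name__ == "__main__":
    for sample in ("7", "10", "19", "20", "31", "59"):
        print(sample, to_kanji(sample))
```

Keeping 10 to 19 in their own table mirrors the data layout above: the tens table starts at 2, so a leading 1 has to be handled separately.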
23 changes: 23 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/symbol.tsv
@@ -0,0 +1,23 @@
& アンド
# ハッシュタグ
@ アット
§ セクション
™ トレードマーク
® 登録商標マーク
© 著作権
_ アンダースコア
% パーセント
* 星印
+ プラス
/ スラッシュ
= エコール
^ 曲折アクセント記号
| 縦棒
~ ティルダ
$ ドール
£ ポンド
€ ユーロ
₩ ウォン
¥
°
º
13 changes: 13 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/time/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
23 changes: 23 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/time/division.tsv
@@ -0,0 +1,23 @@
今朝
今夜
今晩
午前
午後
夕方
夜中
夜半
早朝
明け方
深夜
毎朝
毎夜
毎晩
毎日
真夜中
翌日
未明
正午
真夜中の
24 changes: 24 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/time/hour.tsv
@@ -0,0 +1,24 @@
1 一
2 二
3 三
4 四
5 五
6 六
7 七
8 八
9 九
10 十
11 十一
12 十二
13 十三
14 十四
15 十五
16 十六
17 十七
18 十八
19 十九
20 二十
21 二十一
22 二十二
23 二十三
24 二十四
60 changes: 60 additions & 0 deletions nemo_text_processing/text_normalization/ja/data/time/minute.tsv
@@ -0,0 +1,60 @@
1 一
2 二
3 三
4 四
5 五
6 六
7 七
8 八
9 九
10 十
11 十一
12 十二
13 十三
14 十四
15 十五
16 十六
17 十七
18 十八
19 十九
20 二十
21 二十一
22 二十二
23 二十三
24 二十四
25 二十五
26 二十六
27 二十七
28 二十八
29 二十九
30 三十
31 三十一
32 三十二
33 三十三
34 三十四
35 三十五
36 三十六
37 三十七
38 三十八
39 三十九
40 四十
41 四十一
42 四十二
43 四十三
44 四十四
45 四十五
46 四十六
47 四十七
48 四十八
49 四十九
50 五十
51 五十一
52 五十二
53 五十三
54 五十四
55 五十五
56 五十六
57 五十七
58 五十八
59 五十九
60 六十