Italian_TN (#67)
* add TN italian

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix init

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix LOCATION

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* modify graph_utils

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* correct decimals

Signed-off-by: GiacomoLeoneMaria <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix electronic

Signed-off-by: Giacomo Cavallini <[email protected]>

* fix measure

Signed-off-by: Giacomo Cavallini <[email protected]>

---------

Signed-off-by: GiacomoLeoneMaria <[email protected]>
Signed-off-by: Giacomo Cavallini <[email protected]>
Signed-off-by: Mariana <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <[email protected]>
3 people authored Jun 29, 2023
1 parent abe4a33 commit 9f3d372
Showing 60 changed files with 2,181 additions and 2 deletions.
13 changes: 13 additions & 0 deletions nemo_text_processing/text_normalization/it/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
13 changes: 13 additions & 0 deletions nemo_text_processing/text_normalization/it/data/__init__ .py
@@ -0,0 +1,13 @@
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@@ -0,0 +1,13 @@
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@@ -0,0 +1,13 @@
.com punto com
.org punto org
.gov punto gov
.uk punto UK
.fr punto FR
.net punto net
.html punto html
.py punto python
.ru punto RU
.us punto US
.de punto DE
.it punto IT
.jpg punto jpeg
@@ -0,0 +1,11 @@
gmail
nvidia
outlook
hotmail
yahoo
live
yandex
orange
wanadoo
web
comcast
@@ -0,0 +1,22 @@
. punto
- trattino
_ trattino basso
! punto esclamativo
# cancelletto
$ dollaro
% percento
& e commerciale
' apostrofo
* asterisco
+ più
/ slash
= uguale
? punto interrogativo
^ esponente
` accento
{ apri graffa
| barra verticale
} chiudi graffa
~ tilde
, virgola
: due punti
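A minimal sketch of how the three electronic tables above (domain suffixes, known server names, symbol names) could be combined to read an e-mail address aloud. It copies only a few rows from each table. The function name `verbalize_email` and the word used for "@" are assumptions, since neither appears in the files shown, and the real taggers in this commit may work differently.

```python
# Illustrative only: a few rows copied from the tables in this commit.
DOMAINS = {".com": "punto com", ".org": "punto org", ".it": "punto IT"}
SERVERS = {"gmail", "nvidia", "outlook"}
SYMBOLS = {".": "punto", "-": "trattino", "_": "trattino basso"}
AT_WORD = "chiocciola"  # assumption: the "@" reading is not in the files shown


def verbalize_email(address):
    """Hypothetical helper: spell out an address using the lookup tables."""
    user, _, host = address.partition("@")

    # Try the longest known suffix first so ".com" is read as one unit.
    tail = ""
    for suffix in sorted(DOMAINS, key=len, reverse=True):
        if host.endswith(suffix):
            tail = DOMAINS[suffix]
            host = host[: -len(suffix)]
            break

    # Symbols are replaced by their names; other characters are read one by one.
    user_words = [SYMBOLS.get(ch, ch) for ch in user]
    # Known server names are kept as whole words instead of being spelled out.
    if host in SERVERS:
        host_words = [host]
    else:
        host_words = [SYMBOLS.get(ch, ch) for ch in host]

    words = user_words + [AT_WORD] + host_words
    if tail:
        words.append(tail)
    return " ".join(words)


print(verbalize_email("a_b@gmail.com"))
```

Keeping the server list separate from the suffix list lets a name such as "gmail" be spoken as a word while unknown hosts fall back to character-by-character reading.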
@@ -0,0 +1,13 @@
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@@ -0,0 +1,63 @@
d giorno
f fahrenheit
c celsius
°C grado celsius
°F grado fahrenheit
K kelvin
km chilometro
m metro
cm centimetro
mm millimetro
μm micrometro
nm nanometro
dm decimetro
pm picometro
hm ettometro
ha ettaro
mi miglio
m² metro quadrato -0.0001
km² chilometro quadrato -0.0001
m2 metro quadrato
km2 chilometro quadrato
m3 metro cubo
ft foot
g grammo
µg microgrammo
mg milligrammo
kg chilogrammo
lb libbra
h ora
s secondo
min minuto
ms millisecondo
μs microsecondo
hz hertz
kw kilowatt
ghz gigahertz
khz kilohertz
mhz megahertz
v volt
mc megacoulomb
mA milliampere
A ampere
tw terawatt
mv millivolt
mw megawatt
gw gigawatt
ω ohm
db decibel
gb gigabyte
kb kilobit
pb petabit
mb megabyte
kb kilobyte
tb terabyte
kv kilovolt
mv megavolt
kn kilonewton
ml millilitro
l litro
dl decilitro
bar bar
kcal chilocaloria
cal caloria
@@ -0,0 +1,13 @@
grado celsius gradi celsius
metro metri
ora ore
grammo grammi
secondo secondi
minuto minuti
chilometro chilometri
bit bits
byte bytes
caloria calorie
miglio miglia
litro litri
giorno giorni
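A minimal sketch of how the unit table and the singular-to-plural table above might be used together: look up the unit name, then pick the plural form unless the quantity is exactly one. The helper name `unit_name` is hypothetical, and falling back to the singular when no plural row exists is an assumption, not something the commit states.

```python
# Illustrative only: a few rows copied from the two measurement tables.
UNITS = {"km": "chilometro", "m": "metro", "h": "ora", "l": "litro", "kg": "chilogrammo"}
PLURALS = {"chilometro": "chilometri", "metro": "metri", "ora": "ore", "litro": "litri"}


def unit_name(symbol, quantity):
    """Hypothetical helper: return the Italian unit name for a quantity."""
    singular = UNITS[symbol]
    if quantity == 1:
        return singular
    # Assumption: units without a plural row are left in the singular form.
    return PLURALS.get(singular, singular)


print(unit_name("km", 1))
print(unit_name("km", 5))
```

Note that "chilogrammo" has no row in the plural table shown, so this sketch returns it unchanged for any quantity.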
13 changes: 13 additions & 0 deletions nemo_text_processing/text_normalization/it/data/money/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@@ -0,0 +1,11 @@
€ euro
$ dollaro
£ sterlina
usd dollaro statunitense
eur euro
cad dollaro canadese
chf franco svizzero
hkd dollaro di hong kong
¥ yen
rs rupia
₽ rublo
@@ -0,0 +1,11 @@
€ centesimo
$ centesimo
£ penny
usd centesimo
eur centesimo
cad centesimo
chf centesimo
hkd centesimo
¥ yen
rs paisa
₽ copeca
@@ -0,0 +1,2 @@
sterlina sterline
rupia rupie
@@ -0,0 +1,10 @@
centesimo centesimi
euro euro
dollaro dollari
penny penny
dollaro canadese dollari canadesi
franco svizzero franchi svizzeri
dollaro statunitense dollari statunitensi
rublo rubli
dollaro di hong kong dollari di hong kong
copeca copeche
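A minimal sketch of how the four money tables above (major unit, minor unit, and the two plural tables) could be combined. Amounts are kept as digits because number verbalization is a separate step. The helper name `money_words` and the " e " joiner between the major and minor parts are assumptions.

```python
# Illustrative only: a few rows copied from the money tables in this commit.
MAJOR = {"$": "dollaro", "£": "sterlina", "chf": "franco svizzero"}
MINOR = {"$": "centesimo", "£": "penny", "chf": "centesimo"}
PLURALS = {
    "dollaro": "dollari",
    "sterlina": "sterline",
    "franco svizzero": "franchi svizzeri",
    "centesimo": "centesimi",
    "penny": "penny",
}


def _name(word, amount):
    # Singular for exactly one, otherwise the plural row if one exists.
    return word if amount == 1 else PLURALS.get(word, word)


def money_words(symbol, units, cents=0):
    """Hypothetical helper: attach currency names to a major/minor amount."""
    parts = [f"{units} {_name(MAJOR[symbol], units)}"]
    if cents:
        parts.append(f"{cents} {_name(MINOR[symbol], cents)}")
    # Assumption: the two parts are joined with the conjunction "e".
    return " e ".join(parts)


print(money_words("$", 2, 50))
```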
@@ -0,0 +1,13 @@
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
@@ -0,0 +1,9 @@
uno 1
due 2
tre 3
quattro 4
cinque 5
sei 6
sette 7
otto 8
nove 9
@@ -0,0 +1,8 @@
duecento 2
trecento 3
quattrocento 4
cinquecento 5
seicento 6
settecento 7
ottocento 8
novecento 9
@@ -0,0 +1,2 @@
milioni
miliardi
10 changes: 10 additions & 0 deletions nemo_text_processing/text_normalization/it/data/numbers/teen.tsv
@@ -0,0 +1,10 @@
dieci 10
undici 11
dodici 12
tredici 13
quattordici 14
quindici 15
sedici 16
diciassette 17
diciotto 18
diciannove 19
@@ -0,0 +1,8 @@
venti 2
trenta 3
quaranta 4
cinquanta 5
sessanta 6
settanta 7
ottanta 8
novanta 9
@@ -0,0 +1,8 @@
ventuno 21
trentuno 31
quarantuno 41
cinquantuno 51
sessantuno 61
settantuno 71
ottantuno 81
novantuno 91
@@ -0,0 +1 @@
zero 0
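A minimal sketch of how the number tables above could be inverted and composed to spell cardinals from 0 to 99. The tables store word-to-digit pairs, so the sketch flips them first. The function name `cardinal` is hypothetical. The two spelling rules for units 8 and 3 ("ventotto", "ventitré") are standard Italian orthography but are not present in the files shown, so they are labeled as additions. The grammars in this commit are built differently and cover larger numbers.

```python
# Rows copied from the digit, teen, ties, unique-ties and zero tables.
ZERO = {"zero": "0"}
DIGITS = {"uno": "1", "due": "2", "tre": "3", "quattro": "4", "cinque": "5",
          "sei": "6", "sette": "7", "otto": "8", "nove": "9"}
TEENS = {"dieci": "10", "undici": "11", "dodici": "12", "tredici": "13",
         "quattordici": "14", "quindici": "15", "sedici": "16",
         "diciassette": "17", "diciotto": "18", "diciannove": "19"}
TIES = {"venti": "2", "trenta": "3", "quaranta": "4", "cinquanta": "5",
        "sessanta": "6", "settanta": "7", "ottanta": "8", "novanta": "9"}
TIES_UNIQUE = {"ventuno": "21", "trentuno": "31", "quarantuno": "41",
               "cinquantuno": "51", "sessantuno": "61", "settantuno": "71",
               "ottantuno": "81", "novantuno": "91"}


def invert(table):
    # The files map word -> digits; spelling a number needs digits -> word.
    return {digits: word for word, digits in table.items()}


def cardinal(n):
    """Hypothetical helper: spell an integer from 0 to 99 in Italian."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0..99")
    text = str(n)
    # Whole-number rows take priority: zero, digits, teens, and the "-uno" forms.
    for table in (ZERO, DIGITS, TEENS, TIES_UNIQUE):
        words = invert(table)
        if text in words:
            return words[text]
    tie = invert(TIES)[text[0]]
    unit = text[1]
    if unit == "0":
        return tie
    if unit == "8":
        # Addition not in the tables: the final vowel is dropped before "otto".
        return tie[:-1] + "otto"
    if unit == "3":
        # Addition not in the tables: final "tre" takes an accent in compounds.
        return tie + "tré"
    return tie + invert(DIGITS)[unit]


print(cardinal(21), cardinal(28), cardinal(99))
```

The separate table for 21, 31, and so on exists because the tens word loses its final vowel before "uno", which a plain concatenation of tens and units would not produce.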
65 changes: 65 additions & 0 deletions nemo_text_processing/text_normalization/it/data/whitelist.tsv
@@ -0,0 +1,65 @@
dr dottor
dr. dottor
Dr. dottor
dott. dottor
Dott. dottor
dott.sa dottoressa
Dott.sa dottoressa
prof. professore
Prof. professore
proff. professori
Proff. professori
prof.ssa professoressa
Prof.ssa professoressa
Sig. signore
Sig.na signorina
sig. signore
sig.na signorina
sig signore
Sigg. signori
sigg. signori
Sig.ri signori
sig.ri signori
Sig.re signore
sig.re signore
Sig.ra signora
sig.ra signora
Sig.ne signorine
sig.ne signorine
Avv. avvocato
avv. avvocato
Arch. architetto
arch. architetto
Rag. ragioniere
rag. ragioniere
Geom. geometra
geom. geometra
Ing. ingegnere
ing. ingegnere
Ingg. ingegneri
ingg. ingegneri
Fam. famiglia
fam. famiglia
f.lli fratelli
F.lli fratelli
Rev. reverendo
On. onorevole
on. onorevole
Onn. onorevoli
onn. onorevoli
Min. ministro
Sen. senatore
c.v.d. come volevasi dimostrare
ecc. eccetera
ecc eccetera
v.le viale
p.za piazza
l.go largo
loc. località
ist. istituto
segr. segreteria
set. settore
uff. ufficio
dip. dipartimento
div. divisione
+ più
13 changes: 13 additions & 0 deletions nemo_text_processing/text_normalization/it/taggers/__init__ .py
@@ -0,0 +1,13 @@
# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.