data parser doesn't work #4

Open
zparcheta opened this issue Oct 6, 2017 · 2 comments

zparcheta commented Oct 6, 2017

The example from README.md,

python -m extras.make_dataset --parser brsp --input_parser mfcc --label_parser simple_char_parser

returns the following error:

File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/data/forked/asr-study/extras/make_dataset.py", line 32, in <module>
regex=True)
File "utils/generic_utils.py", line 62, in get_from_module
(name, module, ', '.join(members.keys())))
KeyError: 'brsp not found in datasets*.\n Valid values are: dummy, sid, brsd, voxforge, lapsbm, cslu, datasetparser'
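The KeyError already lists the accepted parser names, and brsp is one character away from brsd. As a rough illustration of how such a lookup could point at the likely typo, here is a minimal sketch. suggest_parser is a hypothetical helper, not part of asr-study, and the name list is simply copied from the error message above.

```python
import difflib

# Parser names copied from the "Valid values are" part of the KeyError above.
VALID_PARSERS = ("dummy", "sid", "brsd", "voxforge", "lapsbm", "cslu", "datasetparser")


def suggest_parser(name, valid=VALID_PARSERS):
    """Return `name` if it is a known parser, else raise KeyError with near matches.

    Hypothetical helper for illustration; not the project's get_from_module.
    """
    if name in valid:
        return name
    # Names whose similarity ratio to `name` is at least 0.6, best first.
    close = difflib.get_close_matches(name, valid, n=3, cutoff=0.6)
    hint = " Did you mean: %s?" % ", ".join(close) if close else ""
    raise KeyError("%s not found. Valid values are: %s.%s"
                   % (name, ", ".join(valid), hint))
```

Calling suggest_parser("brsp") raises a KeyError whose message includes the valid names and a hint pointing at brsd.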

If I change brsp to brsd (the parser that is actually available in the datasets folder), I get:

datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/lapsbm/LapsBM-F019/LapsBM_0378.wav has a forbidden label: "acertou o alvo em quarenta e três por cento das suas chances". Skipping
Traceback (most recent call last):
File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/data/forked/asr-study/extras/make_dataset.py", line 46, in <module>
override=args.override)
File "datasets/dataset_parser.py", line 128, in to_h5
group = f.create_group(dataset)
File "/home/zparcheta/anaconda2/lib/python2.7/site-packages/h5py/_hl/group.py", line 52, in create_group
gid = h5g.create(self.id, name, lcpl=lcpl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2846)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2804)
File "h5py/h5g.pyx", line 151, in h5py.h5g.create (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/h5g.c:2929)
ValueError: Unable to create group (Name already exists)

The warning appears for every line of text, and each of those files is skipped.
How can I prepare the data for training? I have already downloaded the data into the data folder.
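On the final ValueError above: h5py raises it from create_group when a group with that name is already present in the file. One possible cause, and this is only an assumption since to_h5 is not shown in the thread, is an output file left behind by an earlier interrupted run that gets reopened instead of overwritten. A minimal sketch of a guard that clears such a file before a new run follows. prepare_output is a hypothetical helper and is not part of asr-study.

```python
import os


def prepare_output(path, override=False):
    """Make sure `path` is free before a fresh dataset file is written.

    Returns True if a stale file was removed, False if nothing was there.
    Raises IOError when the file exists and `override` is False.
    Hypothetical helper for illustration only.
    """
    if not os.path.exists(path):
        return False
    if not override:
        raise IOError("%s already exists; pass override=True to replace it" % path)
    os.remove(path)
    return True
```

If keeping the existing groups is acceptable, h5py also provides require_group, which returns an existing group instead of raising.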

igormq (Owner) commented Oct 12, 2017

Sorry about this problem. I changed the behavior of label_parser and did not update README.md. Instead of

python -m extras.make_dataset --parser brsp --input_parser mfcc --label_parser simple_char_parser

please use

python -m extras.make_dataset --parser brsd --input_parser mfcc

to create the dataset. I think the rest of README.md is fine.

Let me know if anything else comes up!

igormq closed this as completed Oct 21, 2017

zparcheta (Author) commented Oct 31, 2017

I think there are still some problems. Some of them occur because the path to the wav files is not correct, e.g. asr-study/data/voxforge/brunox-20110225-wqa/216.wav should be asr-study/data/voxforge/brunox-20110225-wqa/wav/216.wav.
Other files simply don't exist. There are also warnings about forbidden labels, and I don't know exactly what they mean.
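To tell the two "not found" cases apart (a file sitting in a wav/ subfolder versus a file that is really missing), a small check like the following could be run over the expected paths. This is a sketch under the assumption that the only layout difference is the extra wav/ directory described above. resolve_wav is a hypothetical helper, not part of asr-study.

```python
import os


def resolve_wav(path, subdir="wav"):
    """Return an existing file for `path`, trying `<dir>/<subdir>/<file>` as a fallback.

    Returns None when neither location holds a file.
    Hypothetical helper for illustration only.
    """
    if os.path.isfile(path):
        return path
    head, tail = os.path.split(path)
    candidate = os.path.join(head, subdir, tail)
    return candidate if os.path.isfile(candidate) else None
```

A None result means the file is absent from both locations, which would correspond to the files that simply don't exist.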

If you know how to run the data parser command properly, please tell me :)
Regards!

python -m extras.make_dataset --parser brsd --input_parser mfcc

datasets.dataset_parser.VoxForge: ERROR File /data/forked/asr-study/data/voxforge/brunox-20110225-wqa/216.wav not found
datasets.dataset_parser.BRSD: WARNING Skipping dataset cslu: Dataset directory provided is not a directory
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014053.wav has a forbidden label: "". Skipping
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014054.wav has a forbidden label: "". Skipping
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014055.wav has a forbidden label: "". Skipping
datasets.dataset_parser.Sid: ERROR File /data/forked/asr-study/data/sid/M0001/M0001000.wav not found
could not be converted in int.ROR age
Traceback (most recent call last):
File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/data/forked/asr-study/extras/make_dataset.py", line 46, in <module>
override=args.override)
File "datasets/dataset_parser.py", line 128, in to_h5
group = f.create_group(dataset)
File "/home/zparcheta/anaconda2/lib/python2.7/site-packages/h5py/_hl/group.py", line 52, in create_group
gid = h5g.create(self.id, name, lcpl=lcpl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2846)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2804)
File "h5py/h5g.pyx", line 151, in h5py.h5g.create (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/h5g.c:2929)
ValueError: Unable to create group (Name already exists)

igormq reopened this Nov 13, 2017