Description
🐛 Bug
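V5 misses most of the speech in cartoon voices: on the Japanese example it returns a single segment, where the older version returns eight.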
To Reproduce
Steps to reproduce the behavior:
- Open the Colab example
- Download this example and run until this cell (change 'en_example.wav' to 'ja_example.wav'):
wav = read_audio('ja_example.wav', sampling_rate=SAMPLING_RATE)
# get speech timestamps from full audio file
speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=SAMPLING_RATE)
pprint(speech_timestamps)
- The result is:
[{'end': 30464, 'start': 12032}]
whereas with the older version (see SYSTRAN/faster-whisper#934 (comment)), the result is:
[{'end': 40192, 'start': 12032},
{'end': 179456, 'start': 76544},
{'end': 379136, 'start': 273152},
{'end': 457984, 'start': 422656},
{'end': 630016, 'start': 576256},
{'end': 669952, 'start': 653056},
{'end': 863488, 'start': 695040},
{'end': 950528, 'start': 896768}]
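To put the gap in numbers, the two results can be compared by segment count and total detected speech. This is only a rough sketch: the 16 kHz value for SAMPLING_RATE is an assumption (its value is not shown in this report), and `total_speech_samples` is a hypothetical helper, not part of the library.

```python
# Assumption: the Colab example runs at 16 kHz; the report does not state SAMPLING_RATE.
SAMPLING_RATE = 16000

# Timestamps (in samples) copied from the two results above.
v5_result = [{'end': 30464, 'start': 12032}]
old_result = [
    {'end': 40192, 'start': 12032},
    {'end': 179456, 'start': 76544},
    {'end': 379136, 'start': 273152},
    {'end': 457984, 'start': 422656},
    {'end': 630016, 'start': 576256},
    {'end': 669952, 'start': 653056},
    {'end': 863488, 'start': 695040},
    {'end': 950528, 'start': 896768},
]


def total_speech_samples(timestamps):
    """Hypothetical helper: sum the length in samples of all detected segments."""
    return sum(seg['end'] - seg['start'] for seg in timestamps)


for name, result in (('V5', v5_result), ('old', old_result)):
    seconds = total_speech_samples(result) / SAMPLING_RATE
    print(f"{name}: {len(result)} segment(s), {seconds:.3f} s of speech")
```

Under the 16 kHz assumption, this works out to roughly 1.15 s of detected speech for V5 against roughly 35.3 s for the older version.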
Expected behavior
V5 should detect speech at least as well as the older version.
Environment
Please copy and paste the output from this
environment collection script
(or fill out the checklist below manually).
You can get the script and run it with:
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed PyTorch (conda, pip, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information:
Additional context