I have been using the Whisper-large-v2 model from HuggingFace for local testing, and found that setting the `return_timestamps=True` parameter in the ASR pipeline returns timestamped transcriptions (see the code snippet below, taken from the HuggingFace model page).
I would like access to these segment-level timestamps for an application I am working on, but it seems this parameter is not exposed in the Whisper Triton deployment here. Can anyone guide me on how to set this parameter / access this output?
```python
import torch
from transformers import pipeline
from datasets import load_dataset

device = "cuda:0" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v2",
    chunk_length_s=30,
    device=device,
)

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = ds[0]["audio"]

prediction = pipe(sample.copy(), batch_size=8)["text"]
# " Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."

prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
# [{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
#   'timestamp': (0.0, 5.44)}]
```
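For context, this is the kind of access I am after on the Triton side: per-segment start/end times alongside the text. A minimal sketch against the `chunks` output above (the `text` and `timestamp` field names are the ones the HuggingFace pipeline returns):

```python
# Minimal sketch: pull per-segment start/end times out of the
# "chunks" structure returned by the HuggingFace pipeline above.
chunks = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
for chunk in chunks:
    start, end = chunk["timestamp"]  # (start, end) in seconds
    print(f"[{start:.2f}s - {end:.2f}s] {chunk['text']}")
```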