Temperature changes are propagated to good portions of batches #8

Open
sbuser opened this issue Jan 20, 2023 · 1 comment

Comments

@sbuser

sbuser commented Jan 20, 2023

It looks to me like a single compression_ratio or avg_logprob that fails a threshold check causes the temperature to be incremented for the entire batch, which is then re-run at the higher temperature.

As batch_size increases, I believe it becomes more likely that a single segment result with an out-of-bounds parameter will cause the entire batch to be re-evaluated at a higher temperature. With large enough batch sizes this may create a kind of toggling where temperature rapidly rises to 1.0 (or the max), since the higher temperatures may produce a worse compression_ratio or avg_logprob in another segment of the batch.
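
To put a rough number on that intuition: if each segment independently fails a threshold check with probability p, the chance that a batch of n segments triggers a fallback is 1 - (1 - p)^n. Independence and the value p = 0.05 are assumptions for illustration only; real failures are probably correlated within an audio file. A minimal sketch:

```python
def batch_fallback_probability(p_fail: float, batch_size: int) -> float:
    """Chance that at least one of `batch_size` segments fails a threshold check.

    Assumes each segment fails independently with probability `p_fail`.
    """
    return 1.0 - (1.0 - p_fail) ** batch_size


if __name__ == "__main__":
    # p_fail = 0.05 is a made-up per-segment failure rate, not a measurement.
    for n in (1, 4, 16, 64):
        print(f"batch_size={n:3d}  P(fallback)={batch_fallback_probability(0.05, n):.3f}")
```

Under those assumptions the probability climbs quickly with batch_size, which is consistent with the worry above about large batches being re-run almost every time.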

I wonder if there's an efficient way to retain the good segments and only re-run the failed segments? Currently the entire inference is re-run against the batch, which is about as inefficient as it could be. Is there any reason in principle the re-run couldn't use a smaller batch_size, e.g. just 1?
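
Something like the following is what I have in mind. This is only a sketch, not the project's code: `DecodeOutcome`, `needs_fallback`, `decode_with_selective_fallback` and the `decode_batch` callable are hypothetical names, and the threshold and temperature defaults are meant to mirror the usual Whisper defaults (treat them as assumptions). Each temperature step re-decodes only the segments that are still failing, so the re-run batch shrinks and can be as small as 1.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence


@dataclass
class DecodeOutcome:
    """Hypothetical stand-in for a per-segment decoding result."""
    text: str
    avg_logprob: float
    compression_ratio: float
    temperature: float


def needs_fallback(
    outcome: DecodeOutcome,
    compression_ratio_threshold: float = 2.4,  # assumed default
    logprob_threshold: float = -1.0,           # assumed default
) -> bool:
    """True if this one segment fails either threshold check."""
    return (
        outcome.compression_ratio > compression_ratio_threshold
        or outcome.avg_logprob < logprob_threshold
    )


def decode_with_selective_fallback(
    segments: Sequence[Any],
    decode_batch: Callable[[List[Any], float], List[DecodeOutcome]],
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
) -> List[Optional[DecodeOutcome]]:
    """Re-run only the failed segments at each higher temperature."""
    results: List[Optional[DecodeOutcome]] = [None] * len(segments)
    pending = list(range(len(segments)))  # indices still awaiting a good result

    for temperature in temperatures:
        if not pending:
            break
        outcomes = decode_batch([segments[i] for i in pending], temperature)
        still_failing = []
        for index, outcome in zip(pending, outcomes):
            # Keep the latest attempt; it is final unless it fails a check
            # and a higher temperature remains to try.
            results[index] = outcome
            if needs_fallback(outcome):
                still_failing.append(index)
        pending = still_failing

    return results
```

Segments that pass at temperature 0.0 keep that result and are never touched again; a segment that fails at every temperature ends up with its last attempt.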

@Blair-Johnson
Owner

We should be able to track which DecodingResults failed and which ones succeeded, and re-run only the failed segments. We could either re-run only the failed segments within transcribe_with_fallback, which would still block further pipeline execution for the rest of the batch, or we could track fallback and temperature on a per-audio basis outside of transcribe_with_fallback. The latter seems like it could be faster, but the former could be an easy enough first step.
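
A rough sketch of the second option, assuming one queue per temperature level so that failures from different batches can be pooled into full batches at the next temperature. None of these names exist in the codebase: `schedule_with_per_audio_fallback`, `decode_batch` and `is_ok` are hypothetical, and the temperature schedule is an assumed default.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Sequence, Tuple


def schedule_with_per_audio_fallback(
    segments: Sequence[Any],
    decode_batch: Callable[[List[Any], float], List[Any]],
    is_ok: Callable[[Any], bool],
    batch_size: int,
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
) -> List[Tuple[Any, float]]:
    """Track the fallback level per segment instead of per batch.

    Returns one (output, temperature_used) pair per input segment.
    """
    # queues[k] holds indices of segments waiting to be decoded at temperatures[k].
    queues: List[Deque[int]] = [deque() for _ in temperatures]
    queues[0].extend(range(len(segments)))
    results: Dict[int, Tuple[Any, float]] = {}

    level = 0
    while level < len(temperatures):
        queue = queues[level]
        if not queue:
            level += 1  # nothing left at this temperature; move to the next
            continue

        indices = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        outputs = decode_batch([segments[i] for i in indices], temperatures[level])
        is_last_level = level == len(temperatures) - 1

        for index, output in zip(indices, outputs):
            if is_ok(output) or is_last_level:
                # Passing segments are final right away and never re-decoded.
                results[index] = (output, temperatures[level])
            else:
                # Only this segment moves on to the next temperature.
                queues[level + 1].append(index)

    return [results[i] for i in range(len(segments))]
```

The trade-off compared with handling it inside transcribe_with_fallback is extra bookkeeping in exchange for keeping retry batches full and letting good segments continue through the pipeline.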

2 participants