This repository has been archived by the owner on Oct 13, 2022. It is now read-only.

Why do ctc.py and mmi.py use use_double_scores=True? #183

Open
galv opened this issue May 6, 2021 · 5 comments

Comments

@galv

galv commented May 6, 2021

I am referring to the following:

use_double_scores=True

log_semiring=True, use_double_scores=True)

This is a bit unusual to me. Was there a particular motivation? @csukuangfj it seems like you were the one who chose to use double instead of single precision.

@csukuangfj
Collaborator

I believe double precision is more accurate for log_sum_exp.

Maybe @danpovey has more experience with this.

I have not compared the speed and accuracy between single and double precision.

@danpovey
Contributor

danpovey commented May 6, 2021

It was out of a concern that for long utterances, we might get roundoff errors being different in the forward vs backward computations, and posteriors that don't sum to 1, causing possible lack of cancellation between num and den.
However, I don't recall whether this was an actual issue in practice. We should test the effect on speed and WER again.
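
A minimal sketch of the roundoff concern described above, not taken from k2 or snowfall. It runs a forward pass and a backward pass over a toy 2-state model in log space; in exact arithmetic both passes give the same total score, so any gap between them is roundoff. The model, the inputs, and the names `make_inputs` and `forward_backward_totals` are hypothetical, and float32 is only emulated here by rounding Python floats through `struct`.

```python
import math
import random
import struct


def to_f32(x):
    """Round a Python float (IEEE double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f64(x):
    """Python floats are already double precision, so nothing to round."""
    return x


def log_add(a, b, rnd):
    """log(exp(a) + exp(b)), with the result rounded by `rnd`."""
    hi, lo = max(a, b), min(a, b)
    return rnd(hi + math.log1p(math.exp(lo - hi)))


def forward_backward_totals(init, trans, emit, rnd):
    """Return (forward_total, backward_total) for a 2-state model.

    Both are the log of the same sum over all state paths, so they agree
    in exact arithmetic; any gap between them is roundoff.
    """
    num_frames = len(emit)
    states = range(2)

    # Forward pass: alpha[j] = log-score of all paths ending in state j.
    alpha = [rnd(init[s] + emit[0][s]) for s in states]
    for t in range(1, num_frames):
        alpha = [
            rnd(log_add(rnd(alpha[0] + trans[0][j]),
                        rnd(alpha[1] + trans[1][j]), rnd) + emit[t][j])
            for j in states
        ]
    fwd = log_add(alpha[0], alpha[1], rnd)

    # Backward pass: beta[i] = log-score of all path suffixes leaving state i.
    beta = [0.0, 0.0]
    for t in range(num_frames - 2, -1, -1):
        step = [rnd(emit[t + 1][j] + beta[j]) for j in states]
        beta = [
            log_add(rnd(trans[i][0] + step[0]),
                    rnd(trans[i][1] + step[1]), rnd)
            for i in states
        ]
    bwd = log_add(rnd(rnd(init[0] + emit[0][0]) + beta[0]),
                  rnd(rnd(init[1] + emit[0][1]) + beta[1]), rnd)
    return fwd, bwd


def make_inputs(num_frames, seed=0):
    """Hypothetical toy inputs: fixed transitions, random emission log-probs."""
    rng = random.Random(seed)
    init = [math.log(0.5), math.log(0.5)]
    trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.2), math.log(0.8)]]
    emit = [[-rng.uniform(0.5, 3.0) for _ in range(2)]
            for _ in range(num_frames)]
    return init, trans, emit


# A "long utterance": the accumulated log-score reaches the thousands,
# where adjacent float32 values are roughly 2e-4 to 5e-4 apart.
init, trans, emit = make_inputs(num_frames=2000)
for name, rnd in (("float32", to_f32), ("float64", to_f64)):
    fwd, bwd = forward_backward_totals(init, trans, emit, rnd)
    print(f"{name}: forward={fwd!r} backward={bwd!r} gap={abs(fwd - bwd):.3e}")
```

Since posteriors are computed from something like exp(alpha + beta - total), a gap in the totals feeds directly into how far the posteriors are from summing to 1. The float32 gap would typically be several orders of magnitude larger than the float64 one, but the exact numbers depend on the inputs, and this toy setup says nothing about the effect on WER.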

@galv
Author

galv commented May 6, 2021

Okay. I definitely think it makes sense to use double for long utterances if we are in probability space. I will keep this in mind as a potential knob to tune.
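
For illustration only, a small sketch of why probability space is so sensitive to precision on long utterances, assuming "probability space" means multiplying raw per-frame probabilities rather than adding log-probabilities. The per-frame probability of 0.1 and the helper `path_score` are made up for this example, and float32 is emulated by rounding through `struct`; none of this is k2 code.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (IEEE double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def path_score(frame_prob, num_frames, rnd=lambda x: x):
    """Score a path whose every frame has probability `frame_prob`.

    Returns (probability-space product, log-space sum), rounding each
    intermediate result with `rnd` (identity means double precision).
    """
    prob = 1.0
    log_prob = 0.0
    frame_log_prob = math.log(frame_prob)
    for _ in range(num_frames):
        prob = rnd(prob * frame_prob)              # shrinks toward underflow
        log_prob = rnd(log_prob + frame_log_prob)  # grows only linearly
    return prob, log_prob


for num_frames in (30, 100, 1000):
    p32, lp32 = path_score(0.1, num_frames, to_f32)
    p64, lp64 = path_score(0.1, num_frames)
    print(f"T={num_frames}: prob f32={p32!r} f64={p64!r} | "
          f"log f32={lp32!r} f64={lp64!r}")
```

In this sketch the product underflows to zero well before 100 frames in float32 and before 1000 frames in double, while the log-space sum stays finite in both. In log space the remaining issue is roundoff rather than underflow.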

@danpovey
Contributor

danpovey commented May 6, 2021 via email

@galv
Author

galv commented May 6, 2021

Don't worry, I know that part (it would be concerning if I didn't!)
