Fixes prediction across all architectures #273

Merged
merged 9 commits into from
Dec 2, 2024

kylebgorman (Contributor) commented Nov 30, 2024

Two bugs found by testing prediction across all architectures:

  • Hard attention, some RNN-backed pointer-generator, and transducer models all inherit the beam search implementation in RNNModel, but it is incompatible with them, so it needs to be disabled. Failing to do so results in an inscrutable error instead of a NotImplementedError. This is now fixed in all locations. In each case I placed the exception relatively "low" (i.e., more derived) in the class hierarchy, so that if someone actually implements beam_decode on these classes, everything will just work.
  • The base model's prediction loop now has what seems like the obvious polymorphic return type, which restores compatibility with hard attention. I suspect the incompatibility was a minor regression introduced in "Added beam search for LSTM" (#257).
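
A minimal sketch of the exception placement described in the first bullet, not the actual code from this PR. Only the names RNNModel, beam_decode, and greedy_decode come from the description above; the other class names, the predict method, the signatures, and the placeholder decoding logic are assumptions made for illustration.

```python
class RNNModel:
    """Hypothetical stand-in for the shared RNN base class."""

    def greedy_decode(self, source):
        # Placeholder decoder: simply echoes the source symbols.
        return list(source)

    def beam_decode(self, source, beam_width):
        # Placeholder beam search: returns beam_width identical hypotheses.
        return [list(source) for _ in range(beam_width)]

    def predict(self, source, beam_width=1):
        # Dispatches on beam width; both decoders are looked up dynamically.
        if beam_width > 1:
            return self.beam_decode(source, beam_width)
        return self.greedy_decode(source)


class HardAttentionBase(RNNModel):
    """Hypothetical intermediate class; deliberately does not block beam_decode."""


class HardAttentionLeaf(HardAttentionBase):
    """Hypothetical derived class for which inherited beam search is incompatible."""

    def beam_decode(self, source, beam_width):
        # Fail loudly and clearly instead of with an inscrutable error.
        raise NotImplementedError(
            f"Beam search is not implemented for {type(self).__name__}"
        )


class HardAttentionWithBeam(HardAttentionBase):
    """Hypothetical sibling that supplies its own compatible beam_decode."""

    def beam_decode(self, source, beam_width):
        return [list(source)]
```

Because the exception lives on the most derived class rather than on the intermediate one, a sibling such as the hypothetical HardAttentionWithBeam can override beam_decode and be picked up by the inherited predict without any further changes.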

Clean-ups done at the same time:

  • Standardizes the name of the special symbol: it's called EOS at various points in the documentation but the code actually calls it END and its tag is <E>.
  • Uses the names beam_decode and greedy_decode everywhere.
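
To illustrate the naming clean-up, here is a small sketch that assumes the symbol is exposed as a constant named END with the tag <E>, as stated above. The helper trim_at_end is hypothetical and is not part of the library; it only shows how a single, consistently named symbol might be used to cut off a decoded sequence.

```python
END = "<E>"  # Name and tag as given in the description; the constant itself is assumed.


def trim_at_end(symbols):
    """Returns the symbols before the first END (hypothetical helper)."""
    result = []
    for symbol in symbols:
        if symbol == END:
            break
        result.append(symbol)
    return result
```

If no END symbol is present, the sequence is returned unchanged.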

kylebgorman marked this pull request as ready for review December 1, 2024 02:17
kylebgorman requested a review from Adamits December 1, 2024 02:17
kylebgorman added the bug label (Something isn't working) Dec 1, 2024
Adamits (Collaborator) left a comment

Left one nit, but LGTM.

Thanks for taking the time to do this, you're doing the lord's work.

kylebgorman merged commit 0a91f56 into CUNY-CL:master Dec 2, 2024
8 checks passed
kylebgorman deleted the predict2 branch December 2, 2024 16:17