Generalization of expert teacher_forcing and monotonicity across model architectures #198

Open
bonham79 opened this issue Jun 10, 2024 · 5 comments

@bonham79
Collaborator

Something I've been thinking about as the library expands: a decent amount of the work we've been doing involves applying inductive biases and teacher-prompted training to specific model architectures. Currently we have:

  • Teacher-student forcing: LSTMs and transformers
  • Expert curricular training: edit action transducer
  • Monotonicity: hard attention LSTM
  • Hard alignment: also hard attention LSTM

One thing I would like to do with the next overhaul is modularize these beyond their respective models (like we're trying to do with #77 for teacher forcing) so that they can be 'dropped in' wherever. This would allow 'fun' combinations such as:

  • Feature-invariant transformer with monotonic assumptions and hard alignment
  • Hard Attention Transducer using SED alignments as a curricular guide.

A lot of these combinations won't necessarily click, but I believe this modularity layer would make it easier to use curricular learning and exploration scheduling, which aren't easy to implement in other libraries, and so expand the library's utility.

(This is a down-the-road thought. Post-beta.)
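
To make the 'drop in wherever' idea concrete, here is a minimal sketch of teacher forcing pulled out of any particular model. It is only an illustration under stated assumptions: `TeacherForcing`, `decode`, `StepFn`, and `echo_step` are hypothetical names, not the library's API, and the decoder is reduced to a plain function from the previous symbol to the next prediction.

```python
import random
from typing import Callable, List, Optional, Sequence

# Hypothetical interface (an assumption, not the library's API): a decoder
# step maps the previously fed symbol to the next predicted symbol.
StepFn = Callable[[str], str]


class TeacherForcing:
    """Decides which symbol is fed back into the decoder at each step."""

    def __init__(self, rate: float, seed: Optional[int] = None) -> None:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be in [0, 1]")
        self.rate = rate
        self.rng = random.Random(seed)

    def next_input(self, predicted: str, gold: str) -> str:
        # With probability `rate`, feed the gold symbol; otherwise feed
        # the model's own prediction.
        return gold if self.rng.random() < self.rate else predicted


def decode(
    step: StepFn,
    gold: Sequence[str],
    forcing: TeacherForcing,
    start: str = "<s>",
) -> List[str]:
    """Runs any decoder step function under the given forcing policy."""
    predictions: List[str] = []
    prev = start
    for gold_symbol in gold:
        predicted = step(prev)
        predictions.append(predicted)
        prev = forcing.next_input(predicted, gold_symbol)
    return predictions


def echo_step(prev: str) -> str:
    """Toy stand-in for a real decoder: upper-cases its input."""
    return prev.upper()


# Fully forced: the decoder always sees the gold history.
forced = decode(echo_step, ["a", "b", "c"], TeacherForcing(rate=1.0))
# Never forced: the decoder only ever sees its own predictions.
free = decode(echo_step, ["a", "b", "c"], TeacherForcing(rate=0.0))
```

Because `decode` only depends on the step function and the policy object, the same policy could in principle wrap an LSTM, a transformer, or a transducer step; whether that holds up for the real architectures is exactly the open question here.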

@bonham79 bonham79 added the enhancement New feature or request label Jun 10, 2024
@bonham79 bonham79 self-assigned this Jun 10, 2024
@Adamits
Collaborator

Adamits commented Jun 13, 2024

Without thinking through how these combinations would work too much, this sounds exciting and like a good idea! I am on board.

@kylebgorman
Contributor

Yeah that sounds like a Johns Hopkins PhD dissertation ;)

@bonham79
Collaborator Author

> Yeah that sounds like a Johns Hopkins PhD dissertation ;)

Am I missing a reference for the JHU?

@kylebgorman
Contributor

> Am I missing a reference for the JHU?

No, it just used to be the home of this sort of thing.

@bonham79
Collaborator Author

N.B. to myself: when generalizing the expert, allow the option to drop out the expert for the transducer.
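
One possible reading of this note, sketched below, is that the expert becomes optional: it can be absent entirely, or skipped on a given step with some probability, in which case the model's own action is used. All names here (`DroppableExpert`, `Expert`, `drop_rate`) are hypothetical and only illustrate the idea; they are not taken from the library.

```python
import random
from typing import Callable, Optional

# Hypothetical type (an assumption): an expert maps a transducer state
# to its preferred edit action.
Expert = Callable[[str], str]


class DroppableExpert:
    """Wraps an optional expert that can be dropped per step."""

    def __init__(
        self,
        expert: Optional[Expert],
        drop_rate: float = 0.0,
        seed: Optional[int] = None,
    ) -> None:
        if not 0.0 <= drop_rate <= 1.0:
            raise ValueError("drop_rate must be in [0, 1]")
        self.expert = expert
        self.drop_rate = drop_rate
        self.rng = random.Random(seed)

    def action(self, state: str, model_action: str) -> str:
        # Fall back to the model's own action when no expert is
        # configured or when the expert is dropped on this step.
        if self.expert is None or self.rng.random() < self.drop_rate:
            return model_action
        return self.expert(state)


def always_copy(state: str) -> str:
    """Toy expert that always recommends the same action."""
    return "copy"


with_expert = DroppableExpert(always_copy, drop_rate=0.0)
dropped = DroppableExpert(always_copy, drop_rate=1.0)
no_expert = DroppableExpert(None)
```

With `drop_rate=0.0` the wrapper always defers to the expert, and with `drop_rate=1.0` or `expert=None` it always returns the model's action, so the transducer could be trained with or without expert guidance through the same call.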
