Generalization of `expert` `teacher_forcing` and `monotonicity` across model architectures #198

bonham79 · 2024-06-10T00:22:14Z

Something I've been thinking about as the library expands: a decent amount of the work we've been using involves applying inductive biases and teacher-prompted training to specific model architectures. Currently we have:

Teacher-student forcing: LSTMs and transformers
Expert curricular training: edit action transducer
Monotonicity: hard attention LSTM
Hard alignment: also hard attention LSTM

One thing I would like to do with the next overhaul is modularize these beyond their respective models (like we're trying to do with #77 for teacher forcing) so that they can be 'dropped in' wherever. This would allow 'fun' combinations such as:

Feature-invariant transformer with monotonic assumptions and hard alignment
Hard attention transducer using SED alignments as a curricular guide

A lot of these combinations won't necessarily click, but I believe adding this modularity layer would make it easier to use curricular learning and exploration scheduling, which aren't easy to implement in other libraries, and so would expand the library's utility.
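To make the 'drop in' idea concrete, here is a minimal sketch in plain Python, not the library's API. It treats monotonicity as a constraint function that any hard-attention decoding loop can accept. The names `Constraint`, `monotonic`, `unconstrained`, and `hard_attend` are all hypothetical, and the attention scores are made-up numbers for illustration.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical type: a constraint maps (scores, previously attended position)
# to a new list of scores. Nothing here is taken from the library itself.
Constraint = Callable[[Sequence[float], int], List[float]]


def unconstrained(scores: Sequence[float], prev: int) -> List[float]:
    """Leaves the scores untouched."""
    return list(scores)


def monotonic(scores: Sequence[float], prev: int) -> List[float]:
    """Masks source positions to the left of the previously attended one."""
    return [s if i >= prev else -math.inf for i, s in enumerate(scores)]


def hard_attend(
    score_steps: Sequence[Sequence[float]],
    constraint: Constraint = unconstrained,
) -> List[int]:
    """Picks exactly one source position per target step (hard alignment).

    The constraint is applied before the argmax, so the same loop serves
    both constrained and unconstrained decoding.
    """
    prev = 0
    path = []
    for scores in score_steps:
        masked = constraint(scores, prev)
        prev = max(range(len(masked)), key=masked.__getitem__)
        path.append(prev)
    return path


# Made-up scores: three target steps over three source positions.
steps = [[0.1, 0.9, 0.0], [0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]
free_path = hard_attend(steps)
mono_path = hard_attend(steps, monotonic)
```

Because the constraint is just an argument, the decoding loop does not need to know which architecture produced the scores, which is the kind of decoupling described above.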

(This is a down-the-road thought. Post-beta.)

Adamits · 2024-06-13T03:22:52Z

Without thinking through how these combinations would work too much, this sounds exciting and like a good idea! I am on board.

kylebgorman · 2024-06-13T14:51:32Z

Yeah that sounds like a Johns Hopkins PhD dissertation ;)

bonham79 · 2024-06-14T16:20:19Z

> Yeah that sounds like a Johns Hopkins PhD dissertation ;)

Am I missing a reference for the JHU?

kylebgorman · 2024-06-14T16:25:27Z

> Am I missing a reference for the JHU?

no it just used to be the home of this sort of thing

bonham79 · 2024-09-15T05:12:52Z

n.b. to myself: when generalizing `expert`, allow the option to drop out the expert for the transducer.
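One possible reading of this note, sketched in plain Python: wrap the expert so that it can be absent or skipped with some probability, in which case the model's own action is used. `OptionalExpert`, `use_prob`, and the toy expert below are hypothetical names for illustration, not part of the library.

```python
import random
from typing import Callable, Optional


class OptionalExpert:
    """Wraps an expert policy so that it can be disabled or dropped out.

    Hypothetical sketch: `expert` maps a state to an action, and `use_prob`
    is the probability of consulting it at a given step.
    """

    def __init__(
        self,
        expert: Optional[Callable[[str], str]],
        use_prob: float = 1.0,
        seed: Optional[int] = None,
    ):
        self.expert = expert
        self.use_prob = use_prob
        self._rng = random.Random(seed)

    def choose(self, state: str, model_action: str) -> str:
        """Returns the expert's action, or the model's if the expert is dropped."""
        if self.expert is None or self._rng.random() >= self.use_prob:
            return model_action
        return self.expert(state)


def toy_expert(state: str) -> str:
    # Stand-in for an oracle over edit actions; always recommends copying.
    return "COPY"


always = OptionalExpert(toy_expert, use_prob=1.0)
never = OptionalExpert(toy_expert, use_prob=0.0)
absent = OptionalExpert(None)
```

With `use_prob=1.0` the expert is always followed, with `use_prob=0.0` or `expert=None` it never is, and values in between give a stochastic mix that a schedule could anneal over training.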

bonham79 added the enhancement New feature or request label Jun 10, 2024

bonham79 self-assigned this Jun 10, 2024

bonham79 mentioned this issue Oct 25, 2024

Generalize edit actions across architectures #261

Open
