Components in a pipeline sometimes reach optimal validation score at very different times #10529
einarbmag started this conversation in Help: Best practices
Replies: 1 comment 2 replies
-
The best solution is to train the components separately. You can write a short third config that assembles the trained components with `source`. We do this for the provided trained pipelines.
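As a minimal sketch, the assembly config can source each component from its own training run; the pipeline names and paths below are assumptions:

```ini
[nlp]
lang = "en"
pipeline = ["textcat","ner"]

[components.textcat]
source = "training/textcat/model-best"

[components.ner]
source = "training/ner/model-best"
```

Since each component keeps its own embedding layer, sourcing them this way doesn't require any listener rewiring; each component is copied in with the weights from its own best checkpoint.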
-
Let's say I have both a text classifier and an NER component in my pipeline, and I haven't found that multi-task learning improves performance, so I keep the components independent with their own embedding layers. In fact, the optimal validation score for the two components occurs at significantly different times in the training process. Selecting the best overall model by picking the checkpoint with the best total score yields a model whose components may be under- or overtrained.
What to do in this situation? I know I can split this into two configs, one per component, run training separately, and then merge the pipelines in code afterwards. I'd really like to avoid that if possible. Looking at the training script, though, I don't see a way for the components to be persisted independently. Another idea is to set different learning rates for the components (if that's possible), but that would be very fiddly. Any thoughts? Maybe I just have to write a script that trains each component separately and merges them automatically at the end?