The goal of this repo is to provide reproducible examples of how to fine-tune a language model, and why you would want to.
Fine-tune an LLM:
- Kick the tires on the base LLM: try prompts, inspect responses
- Collect a dataset that targets the failures you find
- Evaluate more robustly to establish a baseline
- Fine-tune on the dataset
- Kick the tires again
- Re-run the robust evaluation and compare
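The loop above can be sketched in a few lines of plain Python. This is a hypothetical skeleton, not this repo's code: `call_model` is a stub standing in for whatever base or fine-tuned LLM you are testing, and the tiny dataset is made up for illustration.

```python
def call_model(prompt: str) -> str:
    """Stub model: returns canned answers. Swap in a real LLM call here."""
    canned = {"2 + 2 = ?": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "I don't know")

def kick_tires(prompts):
    """'Kick the tires': eyeball a few prompt/response pairs by hand."""
    return [(p, call_model(p)) for p in prompts]

def evaluate(dataset):
    """'Eval more robustly': score exact-match accuracy on labeled pairs."""
    correct = sum(call_model(prompt) == answer for prompt, answer in dataset)
    return correct / len(dataset)

dataset = [
    ("2 + 2 = ?", "4"),
    ("Capital of France?", "Paris"),
    ("Capital of Japan?", "Tokyo"),  # a failure case worth collecting for fine-tuning
]

print(kick_tires(["2 + 2 = ?"]))
print(f"baseline accuracy = {evaluate(dataset):.2f}")
```

After fine-tuning, you would point `call_model` at the new checkpoint and run the same `evaluate` over the same dataset so the before/after numbers are comparable.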