1-Click is all you need.
- Get the diff between the base model and the context-expanded model.
- Get the diff between the base model and the finetuned model (the model whose context will be expanded), then use this diff to calculate the activation ratio.
- Add context-expansion diff * (1 - activation ratio) to the finetuned model.
Done!
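
The steps above can be sketched roughly as follows. This is only an illustration, not the actual script used for this model: it assumes each model is a plain dict of NumPy arrays with matching parameter names and shapes, and `activation_ratio` is a hypothetical stand-in (the finetune diff's magnitude scaled to [0, 1] per tensor), since the exact formula for the ratio is not spelled out here.

```python
import numpy as np


def activation_ratio(ft_diff, eps=1e-12):
    # ASSUMPTION: the real ratio formula is not given in this README.
    # Here it is the per-tensor magnitude of the finetune diff scaled to [0, 1].
    magnitude = np.abs(ft_diff)
    return magnitude / (magnitude.max() + eps)


def merge_context_expansion(base, expanded, finetuned):
    """Apply the three steps to every tensor shared by the three models.

    base, expanded, finetuned: dicts mapping parameter name -> np.ndarray.
    """
    merged = {}
    for name, w_base in base.items():
        # Step 1: diff between base model and context-expanded model.
        ctx_diff = expanded[name] - w_base
        # Step 2: diff between base model and finetuned model, then the ratio.
        ft_diff = finetuned[name] - w_base
        ratio = activation_ratio(ft_diff)
        # Step 3: add diff * (1 - activation ratio) to the finetuned model.
        merged[name] = finetuned[name] + ctx_diff * (1.0 - ratio)
    return merged
```

With this stand-in ratio, weights that finetuning changed the most receive the least of the context-expansion diff, and weights that finetuning left untouched receive all of it.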
Thanks to kuotient (Hugging Face) for pointing out an issue in the initial commit.
Thanks to Sionic AI for providing an A100 cluster.