This repository is a collection of Knowledge Distillation (KD) methods implemented by the Huawei Montreal NLP team.
Included Projects
- MATE-KD
  - KD for model compression; studies the use of adversarial training to improve student accuracy using only the teacher's logits, as in standard KD (a minimal sketch of this objective follows the list).
  - Paper: MATE-KD: Masked Adversarial TExt, a Companion to Knowledge Distillation
- Combined-KD
  - Proposes Combined-KD (ComKD), which takes advantage of data augmentation and progressive training.
  - Paper: How to Select One Among All? An Extensive Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding
- Minimax-kNN
  - A sample-efficient, semi-supervised kNN data augmentation technique.
  - Paper: Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax
- Glitter
  - A universal, sample-efficient framework for incorporating augmented data into training.
  - Paper: When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation
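
The "standard KD" objective referenced in the MATE-KD entry is the usual temperature-scaled distillation loss on teacher logits. The sketch below is illustrative only and is not taken from this repository's code; the function name `kd_loss` and the default `temperature` value are assumptions for the example.

```python
# Minimal sketch of the standard KD objective (not the repository's exact
# implementation): the student is trained to match the teacher's
# temperature-scaled logit distribution via a KL-divergence term.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Temperature-scaled KL divergence between teacher and student logits."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Example usage with random logits: a batch of 8 examples over 3 classes.
student_logits = torch.randn(8, 3)
teacher_logits = torch.randn(8, 3)
loss = kd_loss(student_logits, teacher_logits)
```

In practice this distillation term is typically combined with the regular cross-entropy loss on ground-truth labels; the projects above build on this basic setup with adversarial training, data augmentation, or progressive training.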
This project is licensed under the Apache 2.0 license.