Currently, our unit tests only cover the forward pass for the HF and TP models.
It would make sense to also test the backward pass and compare the loss and, if possible, the gradient updates between the two implementations.
This should probably live in a shared test utility so that it can be reused across different models.
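
To make the idea concrete, here is a rough sketch of what such a utility could look like in PyTorch. The name `assert_backward_matches`, its signature, and the tolerances are hypothetical and not part of the existing test suite. It also assumes both models expose the same parameter names with full, unsharded shapes; for a real TP model the sharded gradients would first have to be gathered and the names mapped. It compares raw gradients after one backward pass rather than parameters after an optimizer step.

```python
import copy

import torch
from torch import nn


def assert_backward_matches(ref_model, test_model, inputs, targets, loss_fn,
                            rtol=1e-5, atol=1e-6):
    """Hypothetical helper: run forward + backward on both models and
    assert that the loss and the per-parameter gradients agree."""
    losses, grads = [], []
    for model in (ref_model, test_model):
        # Clear stale gradients so only this backward pass is compared.
        model.zero_grad(set_to_none=True)
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        losses.append(loss.detach())
        grads.append({
            name: param.grad.detach().clone()
            for name, param in model.named_parameters()
            if param.grad is not None
        })

    # Loss comparison (actual first, expected second).
    torch.testing.assert_close(losses[1], losses[0], rtol=rtol, atol=atol)

    # Assumption: identical parameter names in both models.
    assert grads[0].keys() == grads[1].keys(), "parameter names differ"
    for name in grads[0]:
        torch.testing.assert_close(grads[1][name], grads[0][name],
                                   rtol=rtol, atol=atol)
    return losses[0].item()


# Toy usage: a small MLP stands in for the HF model, a deep copy for the TP model.
torch.manual_seed(0)
ref = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 2))
test = copy.deepcopy(ref)
x = torch.randn(3, 4)
y = torch.randn(3, 2)
loss_value = assert_backward_matches(ref, test, x, y, nn.functional.mse_loss)
```

Taking `loss_fn` and the two models as arguments keeps the helper model-agnostic, so each model's test would only need to supply its own inputs and, for TP, a step that reassembles the sharded gradients before the comparison.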