It seems that `RewardTrainer` does not use `num_items_in_batch` in its loss calculation. According to the `transformers` `Trainer` documentation, shouldn't `self.model_accepts_loss_kwargs = False` be set in that case, so that the loss is still scaled correctly when using gradient accumulation?
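For context, here is a minimal sketch of the workaround I have in mind (not the TRL implementation, just an illustration): subclass `RewardTrainer` and set the flag manually, so `Trainer` falls back to dividing the loss by `gradient_accumulation_steps` instead of assuming the loss was already normalized by `num_items_in_batch`.

```python
from trl import RewardTrainer


class PatchedRewardTrainer(RewardTrainer):
    """Illustrative workaround: force the non-loss-kwargs code path in Trainer."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Since RewardTrainer's compute_loss ignores num_items_in_batch,
        # tell Trainer not to treat the loss as already batch-normalized;
        # it will then scale the loss by 1 / gradient_accumulation_steps.
        self.model_accepts_loss_kwargs = False
```

Is something like this the intended behavior, or should `RewardTrainer` set this flag itself?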