
Loss calculation of RewardTrainer may be inaccurate when performing gradient accumulation? #4104

@jue-jue-zi

Description


num_items_in_batch=None,

It seems that RewardTrainer does not pass num_items_in_batch into its loss calculation. According to the transformers Trainer, when a model (or trainer) does not consume the loss kwargs, self.model_accepts_loss_kwargs should be set to False so that the Trainer rescales the loss itself and gradient accumulation produces the correct gradients. Should RewardTrainer do the same, or otherwise account for num_items_in_batch?
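To illustrate why this matters, here is a minimal sketch (not TRL code; the loss values are made up) of the discrepancy gradient accumulation can introduce when each micro-batch loss is reduced with .mean() instead of being summed and divided by the total item count across the accumulation window, which is what num_items_in_batch enables:

```python
import torch

# Hypothetical per-item losses for two gradient-accumulation micro-batches
# with unequal item counts (3 items vs 1 item).
losses_a = torch.tensor([1.0, 2.0, 3.0])
losses_b = torch.tensor([10.0])

# Averaging each micro-batch separately, then averaging across steps
# (what happens when every forward pass reduces with .mean()):
mean_of_means = (losses_a.mean() + losses_b.mean()) / 2  # (2.0 + 10.0) / 2 = 6.0

# Dividing the summed loss by the total item count over the whole window
# (the correction num_items_in_batch makes possible):
num_items_in_batch = losses_a.numel() + losses_b.numel()
global_mean = (losses_a.sum() + losses_b.sum()) / num_items_in_batch  # 16.0 / 4 = 4.0

print(mean_of_means.item(), global_mean.item())  # 6.0 vs 4.0
```

Whether this bites RewardTrainer in practice presumably depends on whether the number of preference pairs per micro-batch can actually vary (e.g. on the last, partially filled batch).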

Metadata


Labels: ⏳ needs more info (Additional information or clarification is required to proceed), 🐛 bug (Something isn't working)
