GRPO on InternVL3.5 does not support dynamic vision tokens #4061

@sanchit97

Description

The error likely occurs because InternVL uses a dynamic number of vision tokens per image, which are represented in the input sequence by <IMG_CONTEXT> placeholder tokens.
Resizing every image to a fixed resolution avoids the error, but that is less than ideal.
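
For anyone who needs the fixed-resolution workaround in the meantime, here is a minimal sketch of what I mean. The column name `image`, the helper name `resize_image_column`, and the 448x448 target are assumptions for illustration, not something TRL or InternVL requires; any single fixed size should behave the same way.

```python
# Hypothetical workaround sketch: force every image to one fixed resolution
# so that each sample yields the same number of vision tokens.
FIXED_SIZE = (448, 448)  # assumption: any single fixed (width, height) works


def resize_image_column(example, image_key="image", size=FIXED_SIZE):
    """Return a copy of `example` whose image is RGB and exactly `size` pixels."""
    image = example[image_key]  # expected to be a PIL.Image.Image
    resized = image.convert("RGB").resize(size)
    return {**example, image_key: resized}


# Usage with a Hugging Face `datasets.Dataset` (not executed here):
# grpo_train_dataset = grpo_train_dataset.map(resize_image_column)
```

This discards the aspect ratio and any benefit of dynamic tiling, which is why it is only a stopgap.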

Reproduction

from trl import GRPOConfig, GRPOTrainer

# `model`, `processor`, and the two datasets are created earlier (not shown).
training_args = GRPOConfig(
    output_dir="test",
    bf16=True,
    remove_unused_columns=False,
    per_device_train_batch_size=4,
    num_train_epochs=4,
    logging_steps=50,
    max_prompt_length=4096,
    eval_strategy="steps",
    eval_steps=500,
    max_completion_length=512,
    num_generations=4,
    learning_rate=2e-7,
)

trainer = GRPOTrainer(
    model=model,  # InternVL3.5-type model
    args=training_args,
    reward_funcs=[...],  # reward functions elided in this report
    train_dataset=grpo_train_dataset,
    eval_dataset=grpo_eval_dataset,
    processing_class=processor,
)

trainer.train()

outputs:

Traceback (most recent call last):
  ...
[rank0]:   File "/mnt/home/../miniconda3/envs/../lib/python3.10/site-packages/transformers/models/internvl/modeling_internvl.py", line 654, in get_placeholder_mask
[rank0]:     raise ValueError(
[rank0]: ValueError: Image features and image tokens do not match: tokens: 1536, features 512
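
Both numbers in the error are multiples of 256 (1536 = 6 × 256, 512 = 2 × 256). Assuming InternVL emits 256 vision tokens per image tile (my understanding, not verified against this exact checkpoint), the prompt holds placeholders for 6 tiles while the vision tower produced features for only 2. A small sketch of that arithmetic follows; `TOKENS_PER_TILE` and `describe_mismatch` are hypothetical names, not part of transformers or TRL.

```python
TOKENS_PER_TILE = 256  # assumption: vision tokens produced per image tile


def describe_mismatch(num_placeholder_tokens, num_features,
                      tokens_per_tile=TOKENS_PER_TILE):
    """Express the two counts from the ValueError as numbers of tiles."""
    token_tiles, token_rest = divmod(num_placeholder_tokens, tokens_per_tile)
    feature_tiles, feature_rest = divmod(num_features, tokens_per_tile)
    return {
        "token_tiles": token_tiles,      # tiles the prompt reserves space for
        "feature_tiles": feature_tiles,  # tiles the vision tower encoded
        # True when both counts are exact multiples of tokens_per_tile
        "whole_tiles": token_rest == 0 and feature_rest == 0,
    }


# Values taken from the traceback above.
report = describe_mismatch(1536, 512)
print(report)
```

If the counts were not whole multiples of the tile size, that would point to something other than a tile-count disagreement, such as the prompt being truncated partway through an image.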

System Info

trl==0.22

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
  • Any traceback provided is complete

Metadata

Assignees

No one assigned

Labels

🏋 GRPO (Related to GRPO), 🐛 bug (Something isn't working)
