DPOTrainer: pick vision path by processing object type; always pass a tokenizer to chat template #4062

@ginkyenglee

Feature request

  • Decide is_vision_model by the actual processing object type:

    • PreTrainedTokenizerBase → text-only path (tokenize_row)
    • ProcessorMixin (with .tokenizer) → vision path (process_row)
  • In the chat-template step, always pass a tokenizer (if a processor was provided, use its .tokenizer). This keeps the chat templating code agnostic to multimodal processors.
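The two points above could be sketched roughly as follows. This is only an illustration of the proposed dispatch, not the trainer's actual code: `StandInTokenizer`, `StandInProcessor`, and `select_path` are hypothetical names, and the stand-in classes are used so the snippet runs without any dependencies. In a real change the `isinstance` checks would presumably target `PreTrainedTokenizerBase` and `ProcessorMixin` from `transformers`.

```python
class StandInTokenizer:
    """Hypothetical stand-in for a PreTrainedTokenizerBase instance."""


class StandInProcessor:
    """Hypothetical stand-in for a ProcessorMixin that wraps a tokenizer."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer


def select_path(processing_class):
    """Return (is_vision_model, tokenizer) from the object's type alone.

    The model config is deliberately not consulted here.
    """
    if isinstance(processing_class, StandInProcessor):
        # Processor given: vision path (process_row); chat templating
        # still receives the wrapped tokenizer.
        return True, processing_class.tokenizer
    if isinstance(processing_class, StandInTokenizer):
        # Tokenizer given: text-only path (tokenize_row).
        return False, processing_class
    raise TypeError(
        f"Unsupported processing object: {type(processing_class).__name__}"
    )


tok = StandInTokenizer()
proc = StandInProcessor(tok)

print(select_path(tok)[0])   # is_vision_model for a tokenizer
print(select_path(proc)[0])  # is_vision_model for a processor
```

With this shape, the chat-template step only ever sees the second element of the returned pair, so it never needs to know whether a processor was involved.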

Motivation

When running DPO on multimodal model families (e.g., Gemma 3) with text-only data, users pass a tokenizer rather than a processor. The current code infers is_vision_model from model.config.model_type and then calls process_row, which assumes a processor and accesses its .tokenizer attribute. With a plain tokenizer this fails:

AttributeError: <TokenizerClass> has no attribute tokenizer
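For anyone who wants to see the failure mode in isolation, here is a dependency-free sketch. Every name in it is hypothetical (`StandInTokenizer`, `infer_is_vision_from_config`, `process_row_sketch`, `run`), and the `"gemma3"` string is just an illustrative model type. It only mimics the control flow described above and is not the trainer's code.

```python
class StandInTokenizer:
    """Hypothetical stand-in for a plain tokenizer (no .tokenizer attribute)."""


def infer_is_vision_from_config(model_type):
    # Assumption for illustration: the decision depends only on the model
    # family, not on what kind of processing object the user passed.
    return model_type in {"gemma3"}


def process_row_sketch(processing_class):
    # Mirrors the assumption that a processor exposing .tokenizer was given.
    return processing_class.tokenizer


def run(model_type, processing_class):
    if infer_is_vision_from_config(model_type):
        return process_row_sketch(processing_class)
    return processing_class


try:
    run("gemma3", StandInTokenizer())
    error = None
except AttributeError as exc:
    error = exc

print(type(error).__name__)
```

The error is raised because the vision path is chosen from the model type even though the object passed in is a tokenizer.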

Your contribution

I will submit a PR for it.
