fix: mistral nemo does not recognize token_type_ids in forward #2233

NanoCode012 · 2025-01-03T08:36:08Z

Description

The issue does not appear for Mistral 7B v0.3, even though both models use the same `mistral` model type.

Mistral Nemo uses `PreTrainedTokenizerFast` as its tokenizer class, which returns `token_type_ids` (https://github.com/huggingface/transformers/blob/42865860ec6dc135972d9555753cb7ee17f51fb4/src/transformers/tokenization_utils_base.py#L1397), whereas Mistral 7B v0.3 uses `LlamaTokenizer`, which does not (https://github.com/huggingface/transformers/blob/42865860ec6dc135972d9555753cb7ee17f51fb4/src/transformers/models/llama/tokenization_llama.py#L128).

A more future-proof approach could be to follow LlamaFactory, which checks the model's `.forward` signature and drops `token_type_ids` if the parameter is not found.
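
A minimal sketch of that signature-check idea, for illustration only. This is not the code in this PR and not LlamaFactory's actual implementation. The helper name `drop_unsupported_token_type_ids` and the two stand-in model classes are hypothetical, and the batch is assumed to be a plain dict of tokenizer outputs.

```python
import inspect
from typing import Any, Callable, Dict


def drop_unsupported_token_type_ids(
    forward: Callable[..., Any], batch: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of `batch`, without `token_type_ids` if `forward` does not name it.

    Hypothetical helper: only explicitly named parameters count as support.
    """
    params = inspect.signature(forward).parameters
    if "token_type_ids" in params:
        return dict(batch)
    return {key: value for key, value in batch.items() if key != "token_type_ids"}


class NoTokenTypeModel:
    """Stand-in for a model whose forward() has no token_type_ids parameter."""

    def forward(self, input_ids, attention_mask=None, labels=None):
        return len(input_ids)


class TokenTypeModel:
    """Stand-in for a model whose forward() accepts token_type_ids."""

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        return len(input_ids)


# Tokenizer-style output that includes token_type_ids.
batch = {
    "input_ids": [1, 2, 3],
    "attention_mask": [1, 1, 1],
    "token_type_ids": [0, 0, 0],
}

model = NoTokenTypeModel()
filtered = drop_unsupported_token_type_ids(model.forward, batch)
model.forward(**filtered)  # no TypeError: the unsupported key was removed
```

In this sketch a `forward` that only takes `**kwargs` is treated as not supporting `token_type_ids`, so the key is dropped. That is the conservative choice here; a real implementation might decide differently.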

Motivation and Context

How has this been tested?

Confirmed that this fixes Mistral Nemo with packing enabled.

In limited testing, the issue did not appear without packing.

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

fix: mistral nemo does not recognize token_type_ids in forward

2941367

NanoCode012 mentioned this pull request Jan 3, 2025

Mistral Nemo 12B Completion Training Fails from unexpected keyword argument 'token_type_ids' #2225

Open

