
[BUG] Deepspeed / transformers mismatch #945

@pascal-pfeiffer

Description

🐛 Bug

When training with deepspeed, an error occurs that appears to be caused by recent changes in transformers, resulting in a mismatch between the installed deepspeed and transformers versions:

  File "/workspace/.venv/lib/python3.10/site-packages/deepspeed/runtime/config_utils.py", line 57, in __init__
    super().__init__(**data)
  File "/workspace/.venv/lib/python3.10/site-packages/pydantic/main.py", line 250, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for DeepSpeedBF16Config
loss_scale_window
  Extra inputs are not permitted [type=extra_forbidden, input_value=100, input_type=int]
    For further information visit https://errors.pydantic.dev/2.12/v/extra_forbidden
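
For reference, a possible workaround until the versions are realigned, as a minimal sketch: it assumes the DeepSpeed config is available as a plain dict before it reaches deepspeed.initialize(), and the sanitize_bf16_section helper is hypothetical (not part of LLM Studio or deepspeed):

    # Minimal sketch of a possible workaround until versions are realigned.
    # Assumption: the DeepSpeed config is a plain dict that can be edited
    # before engine initialization; sanitize_bf16_section is a hypothetical
    # helper, not part of LLM Studio or deepspeed.

    def sanitize_bf16_section(ds_config: dict) -> dict:
        """Drop keys that DeepSpeedBF16Config rejects as extra inputs."""
        bf16 = ds_config.get("bf16")
        if isinstance(bf16, dict):
            # loss_scale_window is an fp16 loss-scaling option; the bf16
            # pydantic model forbids extra keys, so strip it before the
            # config is validated.
            bf16.pop("loss_scale_window", None)
        return ds_config

    ds_config = {"bf16": {"enabled": True, "loss_scale_window": 100}}
    assert sanitize_bf16_section(ds_config) == {"bf16": {"enabled": True}}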

To Reproduce

  • Train a model with deepspeed (default settings and bf16); a standalone probe of the failing validation is sketched below.
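
The same ValidationError can likely be triggered without a full training run by feeding an equivalent config dict directly to deepspeed's own config class. This is a hedged probe, not the exact LLM Studio code path; the config dict below is illustrative, and the exact behavior depends on the installed deepspeed/transformers versions:

    # Hedged standalone probe of the failing validation path. On an affected
    # deepspeed version this raises the pydantic ValidationError shown in the
    # traceback above; the config dict is illustrative only.
    from deepspeed.runtime.config import DeepSpeedConfig

    ds_dict = {
        "train_micro_batch_size_per_gpu": 1,
        "bf16": {"enabled": True, "loss_scale_window": 100},
    }

    # DeepSpeedBF16Config forbids extra inputs, so loss_scale_window
    # trips "Extra inputs are not permitted" during validation.
    DeepSpeedConfig(ds_dict)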

LLM Studio version

f1cb8b6
