Kaihui-intel
Contributor

@Kaihui-intel Kaihui-intel commented Oct 9, 2025

Type of Change

bug fix

Description

From transformers huggingface/transformers#41103:
completely remove offload_state_dict: it was added several years ago to avoid loading 2x memory on cpu between the model and the state dict -> this is no longer needed from some time as the model is loaded on meta device, then params are loaded one after the other -> removes quite a bit of convoluted logic

Expected Behavior & Potential Risk

the expected behavior that is triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

Signed-off-by: Kaihui-intel <[email protected]>
@PRAgent4INC
Collaborator

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Code Order

The new condition for adding offload_state_dict is placed after the check for transformers version less than 4.51. This might lead to incorrect behavior if offload_state_dict should be set for versions between 4.51 and 4.57.

if parse(transformers.__version__) < parse("4.57"):
    tmp_kwargs["offload_state_dict"] = offload_state_dict
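To make the concern about placement concrete, here is a minimal sketch of a version gate that stands on its own rather than being nested under the earlier 4.51 check, so that versions between 4.51 and 4.57 still receive the argument. The helper names `major_minor` and `build_load_kwargs` are hypothetical and do not come from this PR, and the regex-based version parsing is a simplified stand-in for `packaging.version.parse` that ignores pre-release suffixes.

```python
import re

# First transformers release line that no longer accepts offload_state_dict,
# taken from the condition quoted above.
REMOVED_IN = (4, 57)


def major_minor(version):
    """Return (major, minor) from a version string such as '4.56.2'.

    Simplified stand-in for packaging.version.parse: anything after the
    minor component (patch level, '.dev0', 'rc1') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def build_load_kwargs(transformers_version, offload_state_dict):
    """Hypothetical helper: pass offload_state_dict only to versions that accept it."""
    tmp_kwargs = {}
    # Independent gate: not nested under any other version check, so every
    # version below 4.57 (including 4.51 through 4.56) gets the argument.
    if major_minor(transformers_version) < REMOVED_IN:
        tmp_kwargs["offload_state_dict"] = offload_state_dict
    return tmp_kwargs
```

With this shape, `build_load_kwargs("4.51.0", True)` still includes the key, while `build_load_kwargs("4.57.0", True)` omits it. Whether the merged change is structured this way should be checked against the actual diff.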

@PRAgent4INC
Collaborator

PR Code Suggestions ✨

No code suggestions found for the PR.

@thuang6 thuang6 added this to the 3.6 milestone Oct 9, 2025
@Kaihui-intel Kaihui-intel requested a review from XuehaoSun October 9, 2025 08:06
@Kaihui-intel
Contributor Author

The AutoRound CI will be fixed in a separate PR, #2291.

@Kaihui-intel Kaihui-intel requested a review from xin3he October 9, 2025 08:35
@XuehaoSun XuehaoSun merged commit 81beafc into master Oct 9, 2025
20 of 23 checks passed
@XuehaoSun XuehaoSun deleted the kaihui/transformers_v4.57 branch October 9, 2025 08:37
