
free up state_dict variable memory after loading checkpoint #533

Open
adistomar wants to merge 1 commit into karpathy:master from adistomar:free-memory


@adistomar

This PR releases the state_dict variable once it has been used to load a checkpoint into the model. If the variable is kept, it holds a second copy of the model weights, which wastes a significant amount of GPU memory.
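The idea described above can be sketched as follows. This is a minimal illustration, not the PR's actual diff: the tiny `nn.Linear` model, the in-memory buffer standing in for a checkpoint file, and the `probe` weak reference are all assumptions made for the example, and it runs on CPU so it works without a GPU. Note that the loaded tensors are only released once every name that refers to them is dropped, which here means both `state_dict` and the enclosing `checkpoint` dict.

```python
import gc
import io
import weakref

import torch
import torch.nn as nn

# Hypothetical stand-ins: a tiny model and an in-memory "checkpoint file".
source = nn.Linear(4, 4)
model = nn.Linear(4, 4)
buffer = io.BytesIO()
torch.save({"model": source.state_dict()}, buffer)
buffer.seek(0)

# Load the checkpoint and copy its weights into the model's own parameters.
checkpoint = torch.load(buffer, map_location="cpu")
state_dict = checkpoint["model"]
model.load_state_dict(state_dict)

# Weak reference to one loaded tensor, used only to observe its release.
probe = weakref.ref(state_dict["weight"])

# Drop every reference to the loaded tensors so their memory can be reclaimed.
state_dict = None
checkpoint = None
gc.collect()

# True once the loaded copy of the weights is gone; the model keeps its own.
weights_released = probe() is None
```

On a GPU, the memory released this way returns to PyTorch's caching allocator and can be reused by later allocations; calling `torch.cuda.empty_cache()` is only needed if the memory should also be handed back to the driver.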

IgorTavcar added a commit to IgorTavcar/nanoGPT that referenced this pull request Mar 5, 2026
Avoids keeping duplicate model weights in memory for the
entire training run.

Cherry-picked from karpathy#533

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
