Feature request: Support splitting model weights and training states into separate checkpoint files #21170

@yilin404

Description & Motivation

Feature Request

Currently, PyTorch Lightning saves the entire training state (model weights, optimizer states, scheduler states, trainer state, etc.) into a single .ckpt file.

I would like to have an option to separate model weights (and config) from training states when saving checkpoints.

For example, the desired checkpoint structure could look like this:

checkpoints/
    pretrained_model/
        config.json            # model configuration
        model.safetensors      # model weights only
    training_states.pth        # optimizer, LR scheduler, trainer states
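
To make the request concrete, here is a minimal sketch of how such a split could be done by hand today. It is not an existing Lightning feature. It assumes the checkpoint is the usual Lightning dictionary with a "state_dict" key for weights and a "hyper_parameters" key for the saved config, and it treats every other key (optimizer states, LR schedulers, epoch, global step, and so on) as training state. The function names `split_checkpoint` and `write_split_checkpoint` are hypothetical, and placing training_states.pth next to pretrained_model/ is an assumption about the intended layout.

```python
import json
from pathlib import Path
from typing import Any, Dict, Tuple

# Keys Lightning uses in its checkpoint dictionary for weights and saved hparams.
WEIGHTS_KEY = "state_dict"
CONFIG_KEY = "hyper_parameters"


def split_checkpoint(
    checkpoint: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any]]:
    """Split one checkpoint dict into (weights, config, training_state).

    Hypothetical helper: the input dict is left unmodified.
    """
    weights = dict(checkpoint.get(WEIGHTS_KEY, {}))
    config = dict(checkpoint.get(CONFIG_KEY, {}))
    # Everything that is neither weights nor config is treated as training state.
    training_state = {
        key: value
        for key, value in checkpoint.items()
        if key not in (WEIGHTS_KEY, CONFIG_KEY)
    }
    return weights, config, training_state


def write_split_checkpoint(checkpoint: Dict[str, Any], out_dir: str) -> None:
    """Write the three parts to disk in the layout proposed above.

    Hypothetical helper: requires torch and safetensors to be installed.
    """
    import torch
    from safetensors.torch import save_file

    weights, config, training_state = split_checkpoint(checkpoint)
    root = Path(out_dir)
    model_dir = root / "pretrained_model"
    model_dir.mkdir(parents=True, exist_ok=True)

    # default=str is a fallback for hyperparameters that are not JSON-serializable.
    (model_dir / "config.json").write_text(json.dumps(config, indent=2, default=str))
    # save_file expects a flat dict of contiguous tensors.
    save_file(
        {name: tensor.contiguous() for name, tensor in weights.items()},
        str(model_dir / "model.safetensors"),
    )
    torch.save(training_state, root / "training_states.pth")
```

One place this could be called from is a custom callback's `on_save_checkpoint` hook, which receives the checkpoint dictionary. Models with tied (shared) weights may need extra handling before `save_file` accepts them, and resuming training would require merging the three files back into one dictionary, which is why built-in support would be preferable to a workaround like this.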

Pitch

No response

Alternatives

No response

Additional context

No response

cc @lantiga @Borda
