add support for safetensors in pytorch reader #2721
base: main
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff           @@
##             main    #2721   +/-   ##
=======================================
  Coverage   83.60%   83.60%
=======================================
  Files         819      819
  Lines      106600   106605       +5
=======================================
+ Hits        89124    89129       +5
  Misses      17476    17476

View full report in Codecov by Sentry.
Thanks for the addition 🙏
Looks pretty good overall, just some minor comments.
    .map(|(key, tensor)| (key, CandleTensor(tensor)))
    .collect();

// Check if it's a safetensors file.
let is_safetensors = path.extension().is_some_and(|ext| ext == "safetensors");
Not sure if that's a very robust way to differentiate between the two 😅 pretty sure I've seen a lot of `.safetensor` files (without the plural form). I think we should add a field to `LoadArgs` instead, so users can specify explicitly when they're loading a safetensors file: `LoadArgs::new(...).with_safetensors(true)`
@@ -17,7 +17,7 @@ fn main() {
     // Load PyTorch weights into a model record.
     let record: model::ModelRecord<B> = PyTorchFileRecorder::<FullPrecisionSettings>::default()
-        .load("pytorch/mnist.pt".into(), &device)
+        .load("pytorch/mnist.safetensors".into(), &device)
To decide whether we should load the pickle file or the safetensors file, we can add a `safetensors` feature flag to the example and check it here with something like:

let ext = if std::env::var("CARGO_FEATURE_SAFETENSORS").is_ok() {
    "safetensors"
} else {
    "pt"
};

and then load the correct file.
We should update the README with a small mention.
IMO maybe we can have something like
I think this is a good point, and it also builds the scaffolding for potentially rewriting it to remove the Candle dependency.
I agree that the format is not strongly tied to PyTorch, but I think most models available in safetensors format are PyTorch models 😅 Unless you mean supporting the safetensors format as another recorder to load and save modules. In that case, I'm not sure it's a meaningful addition.
Pull Request Template

Checklist

- The run-checks all script has been executed.

Related Issues/PRs

#626

Changes

Simple addition to the already implemented reader.rs, supporting the safetensors format using Candle with CPU device import.

Testing

In the examples/pytorch-import directory, there is a mnist.safetensors file that is successfully imported.