@laggui laggui commented Aug 4, 2025

Checklist

  • Confirmed that cargo run-checks command has been executed.
  • Made sure the book is up to date with changes in this PR.

Related Issues/PRs

#2373
#3448

The current multi-threaded dataloader prioritizes the number of workers: the dataset is first split into num_workers partial datasets, and each worker then retrieves batches from its own partial dataset. Because every partial dataset can end with an incomplete batch, this can yield more partial batches than a single-threaded pass over the same data, which doesn't match most users' expected behavior.
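To make the issue concrete, here is a minimal sketch of the worker-first behavior described above. It only computes batch sizes and is not the dataloader code: `worker_first_batch_sizes` is a hypothetical helper, and it assumes contiguous, near-equal subsets where the last worker absorbs the remainder.

```rust
/// Hypothetical helper: batch sizes produced when the dataset is first split
/// per worker and each subset is then batched independently.
fn worker_first_batch_sizes(len: usize, num_workers: usize, batch_size: usize) -> Vec<usize> {
    assert!(num_workers > 0 && batch_size > 0);
    let per_worker = len / num_workers;
    let mut sizes = Vec::new();
    for worker in 0..num_workers {
        // Assumption: the last worker takes whatever is left over.
        let subset_len = if worker == num_workers - 1 {
            len - per_worker * worker
        } else {
            per_worker
        };
        // Each subset is batched on its own, so each one can end with a partial batch.
        let mut remaining = subset_len;
        while remaining > 0 {
            let size = remaining.min(batch_size);
            sizes.push(size);
            remaining -= size;
        }
    }
    sizes
}
```

With 10 items, 2 workers and a batch size of 4, this sketch yields batch sizes [4, 1, 4, 1], whereas a single worker yields [4, 4, 2].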

Changes

  • Added PartialDataset::split_chunks, which evenly distributes complete chunks/batches across multiple partial datasets.
  • Changed the multi-threaded dataloader to use split_chunks when the BatchStrategy defines the expected batch size (the default, with a fixed batch size).
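
The chunk-first idea can be sketched as follows. This is an illustration over index ranges only, not the actual PartialDataset::split_chunks implementation: `split_chunk_ranges` and its parameters are hypothetical names, and handing the leftover chunks to the first splits is an assumption of this sketch.

```rust
use std::ops::Range;

/// Hypothetical helper: split `len` items into `num_splits` contiguous index
/// ranges so that every range except possibly the last non-empty one holds
/// only complete chunks of `chunk_size` items.
fn split_chunk_ranges(len: usize, num_splits: usize, chunk_size: usize) -> Vec<Range<usize>> {
    assert!(num_splits > 0 && chunk_size > 0);
    // Total number of chunks, counting a trailing partial chunk if there is one.
    let total_chunks = len.div_ceil(chunk_size);
    let base = total_chunks / num_splits;
    let extra = total_chunks % num_splits;

    let mut ranges = Vec::with_capacity(num_splits);
    let mut start = 0;
    for split in 0..num_splits {
        // Assumption: the first `extra` splits each receive one additional chunk.
        let chunks = base + usize::from(split < extra);
        // Clamp to `len` so the trailing partial chunk is not overrun.
        let end = (start + chunks * chunk_size).min(len);
        ranges.push(start..end);
        start = end;
    }
    ranges
}
```

With 10 items, 2 splits and a chunk size of 4, the sketch returns the ranges 0..8 and 8..10, so the workers together produce batches of 4, 4 and 2, the same batch sizes a single-threaded pass would produce.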

Testing

Added unit tests

codecov bot commented Aug 4, 2025

Codecov Report

❌ Patch coverage is 98.78049% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 62.52%. Comparing base (689aa77) to head (d94a956).
⚠️ Report is 4 commits behind head on main.

Files with missing lines | Patch % | Lines
...rates/burn-core/src/data/dataloader/multithread.rs | 97.05% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3476      +/-   ##
==========================================
+ Coverage   62.49%   62.52%   +0.03%     
==========================================
  Files        1016     1016              
  Lines      113736   113816      +80     
==========================================
+ Hits        71080    71165      +85     
+ Misses      42656    42651       -5     

@laggui laggui merged commit c7a05db into main Aug 5, 2025
10 checks passed
@laggui laggui deleted the feat/data/split-chunks branch August 5, 2025 13:37