Hello, thank you for sharing your work.
I was looking at the experimental setup details provided in the documentation (or README), and I have a question regarding the effective batch size calculation.
The table states:
- Hardware: 2 GPUs
- Details: batch size 8 per device, 4 gradient accumulation steps
- Resulting Batch Size: 32
However, based on the standard calculation (batch size per device × number of GPUs × gradient accumulation steps), the effective batch size should be 8 × 2 × 4 = 64.
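For reference, here is the calculation I used, as a minimal sketch assuming a standard data-parallel setup; the variable names below follow Hugging Face Trainer conventions and may not match the actual config keys in this repo:

```python
# Sanity check of the effective batch size for data-parallel training:
# effective = per-device batch * number of GPUs * gradient accumulation steps.
# Names mirror Hugging Face Trainer arguments; this repo's config may differ.
per_device_train_batch_size = 8
num_gpus = 2
gradient_accumulation_steps = 4

effective_batch_size = (
    per_device_train_batch_size * num_gpus * gradient_accumulation_steps
)
print(effective_batch_size)  # prints 64, not 32
```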
Could you please clarify whether the effective batch size is indeed 64 (and 32 is a typo), or whether a different batch size per device or number of accumulation steps was used?
Thank you!
