Subtract n from batch_size found by BatchSizeFinder with mode='binsearch' #21154

@sergey-protserov-uhn

Description & Motivation

I have tried the BatchSizeFinder callback and I find it awesome: one less hyper-parameter to worry about. Thank you for the great work you all do!

I especially liked the idea of mode='binsearch', but after running in this mode with the default values of the other parameters (steps_per_trial=3, max_trials=25) I hit a CUDA out-of-memory error. Presumably the found batch_size was too close to the maximum my GPU could handle, so some other stray allocation later on was enough to cause the OOM.

This made me think that an additional parameter letting the user keep a certain amount of memory free "just in case" would make mode='binsearch' much more robust.

Pitch

Say I suspect that the memory equivalent of reserve_n_items items may be allocated later on for some reason. I pass this value when initializing BatchSizeFinder, and once it finds the maximum batch_size that fits, it subtracts reserve_n_items from it and proceeds with the reduced value.
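
To make the pitch concrete, here is a minimal sketch of the adjustment step only, written as plain Python so it does not depend on Lightning internals. The helper name `apply_reserve` is hypothetical, `reserve_n_items` is the parameter proposed above (it does not exist in BatchSizeFinder today), and clamping the result to at least 1 is my own assumption about what a sensible lower bound would be.

```python
def apply_reserve(found_batch_size: int, reserve_n_items: int = 0) -> int:
    """Return the batch size to proceed with after leaving headroom.

    found_batch_size: the maximum fitting batch size reported by the search.
    reserve_n_items:  proposed parameter, number of items' worth of memory to keep free.
    """
    if found_batch_size < 1:
        raise ValueError("found_batch_size must be at least 1")
    if reserve_n_items < 0:
        raise ValueError("reserve_n_items must be non-negative")
    # Assumption: never go below a batch size of 1, even if the reserve is larger.
    return max(1, found_batch_size - reserve_n_items)


if __name__ == "__main__":
    # Example: the search found 128, and we keep room for 8 items.
    print(apply_reserve(128, reserve_n_items=8))
```

With the default of `reserve_n_items=0` the found value is returned unchanged, so existing behaviour would not be affected.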

Alternatives

Same as above, but with a fractional multiplier, i.e. proceed with multiplier_parameter * found_batch_size. I like this idea less, because it is easier to reason about item counts (and to estimate their memory requirements) than about fractions.
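
For comparison, a rough sketch of the multiplier alternative. Again this is plain Python, not Lightning code: `apply_multiplier` is a hypothetical helper name, `multiplier_parameter` is the parameter name from the paragraph above, and rounding down plus the restriction to the range (0, 1] are assumptions on my part.

```python
import math


def apply_multiplier(found_batch_size: int, multiplier_parameter: float = 1.0) -> int:
    """Scale the found batch size by a fraction and round down."""
    if found_batch_size < 1:
        raise ValueError("found_batch_size must be at least 1")
    if not 0.0 < multiplier_parameter <= 1.0:
        raise ValueError("multiplier_parameter must be in (0, 1]")
    # Assumption: round down so the result never exceeds the scaled value,
    # and never go below a batch size of 1.
    return max(1, math.floor(found_batch_size * multiplier_parameter))


if __name__ == "__main__":
    # The headroom left by a fixed fraction grows with the found batch size.
    for found in (32, 128, 512):
        kept = apply_multiplier(found, 0.75)
        print(found, kept, found - kept)
```

The loop shows why I find this harder to reason about: the same multiplier leaves a different number of items free depending on what the search found, whereas an item count leaves the same headroom every time.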

Additional context

No response

cc @lantiga @Borda

Labels: feature, tuner
