After #5581, torch dynamo can be enabled via `engine.compile`, and there is no longer a config entry for it, which simplifies things a lot.
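For reference, a minimal sketch of the post-#5581 flow, assuming `engine.compile()` with no arguments picks a default backend; the toy model and config values below are illustrative only, not from the issue:

```python
import torch
import deepspeed

# Dynamo is now turned on by calling compile() on the DeepSpeed engine,
# not through a "compile" section in ds_config.
model = torch.nn.Linear(8, 8)  # placeholder model for illustration
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

engine, _, _, _ = deepspeed.initialize(model=model,
                                       model_parameters=model.parameters(),
                                       config=ds_config)
engine.compile()  # wraps the module with torch.compile / dynamo
```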
Internally, we are primarily supporting a few models, specifically LLaMA and its variants. To minimize user effort, we aim to:
- **Automatically set leaf modules**: for example, `LlamaDecoderLayer` for LLaMA.
- **Adjust prefetch arguments**: optimize prefetch settings when dynamo is enabled (`max_live_parameters` and `prefetch_bucket_size`, to be specific); see the sketch after this list.
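As an illustration of the kind of adjustment this means in practice, here is a rough sketch assuming `deepspeed.utils.set_z3_leaf_modules` is the leaf-module hook and the ZeRO-3 keys `stage3_max_live_parameters` / `stage3_prefetch_bucket_size` are the prefetch knobs; the checkpoint name and numeric values are placeholders, not tuned recommendations:

```python
from transformers import AutoModelForCausalLM
from transformers.models.llama.modeling_llama import LlamaDecoderLayer
from deepspeed.utils import set_z3_leaf_modules

# Illustrative checkpoint; any LLaMA-family model would do.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Treat each decoder layer as a ZeRO-3 leaf module so dynamo does not trace
# into the per-parameter gather/partition hooks inside it.
set_z3_leaf_modules(model, [LlamaDecoderLayer])

# Loosen the prefetch-related ZeRO-3 knobs when dynamo will be enabled.
# The values below are placeholders for illustration only.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,
        "stage3_max_live_parameters": 3_000_000_000,
        "stage3_prefetch_bucket_size": 1_000_000_000,
    },
}
```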
Previously, in DeepSpeed, we could detect if dynamo was enabled via ds_config, allowing us to apply these adjustments seamlessly. However, with the recent changes, it is now challenging to determine if dynamo is enabled during the ds_init phase.
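For context, a sketch of the kind of check that used to be possible; the `"compile"`/`"enabled"` keys are an assumption about the removed config schema, and the tuning step is only indicated as a comment, not a DeepSpeed API:

```python
def dynamo_requested(ds_config: dict) -> bool:
    """Return True if the (old-style) config asks for torch.compile / dynamo."""
    return bool(ds_config.get("compile", {}).get("enabled", False))

ds_config = {"zero_optimization": {"stage": 3}, "compile": {"enabled": True}}
if dynamo_requested(ds_config):
    # ...apply the leaf-module and prefetch tuning during ds_init...
    pass
```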
To solve that, IMO there are several options:
1. Just warn when dynamo is enabled later and the tuning I mentioned has not been applied, and ask the user to change their code/config.
We're currently migrating from option 2 to option 3, since that is easier to maintain and adds less ad-hoc logic to DeepSpeed. We'd like to hear your opinions here, including but not limited to: whether you think this is a good idea overall, what you think is the best way to implement it, and whether you would like to get this upstreamed.
Thanks!
@tohtana @loadams cc @SunMarc @tjruwase