
How to skip the batch when OOM happens? #652

Open
Xinheng-He opened this issue Sep 6, 2024 · 1 comment

Comments

@Xinheng-He

Hi developers:

Hydra-lightning is a really cool tool and I like it! However, my batches contain graphs of very different sizes, which sometimes causes OOM (GPU out-of-memory) errors. Previously I would skip such a batch manually, but in hydra-lightning this seems hard to do. Could this be added in a future version, or how can I skip a batch when it triggers an OOM?

Xinheng

@Xinheng-He
Author

[screenshot of the modified training_step]
I made it work by adding a guard in training_step like this. However, when I run the code on multiple GPUs, training stalls when the step returns None (whether or not I clear the cache first), so I think this trick only works for single-GPU training.
Hope it helps others.
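
For reference, here is a minimal sketch of this kind of OOM guard. The screenshot above is not reproduced here, so the class name and the `compute_loss` helper are illustrative, not the author's exact code; it also assumes a recent PyTorch that exposes `torch.cuda.OutOfMemoryError`:

```python
import torch
from lightning import LightningModule  # or pytorch_lightning, depending on the template version


class GraphLitModule(LightningModule):
    """Hypothetical module illustrating how to skip a batch that runs out of GPU memory."""

    def training_step(self, batch, batch_idx):
        try:
            loss = self.compute_loss(batch)  # hypothetical forward-pass + loss helper
        except torch.cuda.OutOfMemoryError:
            # Free the cached blocks left by the failed forward pass,
            # then ask Lightning to skip this batch by returning None.
            torch.cuda.empty_cache()
            return None
        self.log("train/loss", loss)
        return loss
```

Returning None from training_step makes Lightning skip the optimizer step for that batch, which works on a single GPU. Under DDP, however, the ranks that did not hit the OOM still wait for the gradient all-reduce, so training can hang when only one rank returns None, which matches the behaviour reported above.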
