Native integration of pytorch_memlab or something like it #5189
Labels: feature (Is an improvement or enhancement), help wanted (Open to be worked on), won't fix (This will not be worked on)
🚀 Feature
Fine-grained memory profiling in pytorch-lightning that explains where GPU memory is allocated and where memory accesses are coming from.
Motivation
Pitch
pytorch-lightning is designed to make it easy to train networks with pytorch. However, debugging GPU metrics such as `utilization.memory` and `memory.used` is ad hoc and tricky: best practices don't always work, and even experts writing complicated nets would benefit from a simple, fine-grained profiler.
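For context, `pytorch_memlab` already provides this kind of accounting as a standalone tool; a native integration could presumably surface the same reports through Lightning. Below is a minimal sketch of its standalone usage; the toy module and tensor shapes are made up for illustration:

```python
import torch
from pytorch_memlab import MemReporter, profile


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(1024, 1024)

    # records line-by-line CUDA memory usage for this method
    @profile
    def forward(self, x):
        return self.layer(x)


model = Net().cuda()
model(torch.randn(64, 1024, device="cuda"))

# per-tensor breakdown of everything currently resident on the GPU
MemReporter(model).report()
```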
Alternatives
Additional context
Attached is a graph of my `utilization.memory` from the wandb.ai dashboard:
I am tearing my hair out trying to figure out why this is the case. As far as I can tell, everything is on the GPU, and I don't know where the memory accesses are coming from. I'd love a one-liner tool that explained this, rather than poking around blindly in a haphazard way.
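For what it's worth, the closest thing to a one-liner in plain PyTorch today is `torch.cuda.memory_summary()`, which dumps allocator statistics but doesn't attribute memory to particular tensors or lines of code:

```python
import torch

# allocator-level snapshot: totals and peaks, but with no
# attribution to specific tensors or lines of code
print(torch.cuda.memory_summary())

# coarse counters, useful for bracketing a suspect region
print(torch.cuda.memory_allocated(), torch.cuda.max_memory_allocated())
```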