OOM issue #49

yiyexy · 2024-04-08T13:43:01Z

When I tried to load the llava-qwen72B model on an H800 GPU, I ran into an out-of-memory error. It seems that this framework loads a complete copy of the model onto each GPU. How can I shard the model across GPUs so that it doesn't run out of memory?

kcz358 · 2024-04-09T01:29:24Z

Hi, you can refer to #12 and #4.

yiyexy closed this as completed Apr 9, 2024
