From my understanding, when using the transformers accelerate tool, running the HUGE model means loading the entire thing into RAM. Is there any way for it to process the model as it loads into RAM, or is loading it all first a necessity? I have 614 GB of RAM. I am also curious whether there is a way to edit the program while the model is stored in memory. Is there any way to change how it processes on the CPU? I know that on the GPU you can choose between FP32, FP16, and INT8, but I haven't found information on CPU inference options beyond the huggingface.co example.
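On the precision question, here is a minimal sketch (plain NumPy, no transformers needed; the 1-billion-parameter count is purely an illustrative assumption) of how the choice between FP32, FP16, and INT8 changes how much memory a fixed number of weights occupies, which applies on CPU just as on GPU:

```python
import numpy as np

# Hypothetical parameter count -- purely illustrative, not the real model size.
n_params = 1_000_000_000

# Bytes per weight at each precision: FP32 = 4, FP16 = 2, INT8 = 1.
for name, dtype in [("FP32", np.float32), ("FP16", np.float16), ("INT8", np.int8)]:
    total_bytes = n_params * np.dtype(dtype).itemsize
    print(f"{name}: {total_bytes / 2**30:.1f} GiB")
```

So halving the precision halves the resident weight memory, independent of which device runs the matmuls.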
I spent a few hours fiddling with it but kept getting errors; is ONNX better to the point that I should investigate further? I think my system got botched while messing around with it. Would ONNX reduce memory usage? I struggled to find what exactly it would improve. Thanks!