What are best practices of deploying transformers on GPUs w/re batching? #10337
DSLituiev started this conversation in Help: Best practices · Replies: 1 comment
I need to run inference on multiple documents on a GPU. The documentation covers multiprocessing + GPUs, but says nothing about using GPUs alone. When I run my pipeline on an M1 Mac, it complains that there is no GPU. However, when I run it on a GPU machine, it does not appear to use the GPUs: no new processes show up in `nvidia-smi`, and the speed is comparable to CPU inference. Is there a way to control whether, and which, GPU device is used?
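For reference, the `transformers` pipeline API accepts a `device` argument (`-1` for CPU, `N >= 0` for CUDA GPU `N`) and a `batch_size` argument for grouping documents per forward pass. The sketch below shows one way to wire these up; the `select_device` helper is a hypothetical convenience for illustration, not part of the library, and the pipeline calls are shown in comments since they require `torch` and `transformers` to be installed.

```python
def select_device(cuda_available: bool, preferred_gpu: int = 0) -> int:
    """Return a value for the pipeline's `device` argument:
    the GPU ordinal if CUDA is available, otherwise -1 (CPU).

    Hypothetical helper for illustration; not part of transformers.
    """
    return preferred_gpu if cuda_available else -1


# Typical usage (not executed here; requires torch + transformers):
#
#   import torch
#   from transformers import pipeline
#
#   device = select_device(torch.cuda.is_available(), preferred_gpu=0)
#   pipe = pipeline("text-classification", device=device)
#
#   # Batch multiple documents in a single call; `batch_size`
#   # controls how many are grouped per GPU forward pass:
#   results = pipe(list_of_documents, batch_size=8)
```

With `device=0` set, the process should appear in `nvidia-smi` while inference runs; if it does not, the model is still executing on CPU.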