I have started MPS with a division factor of 10, but in our application scenario we might need to allocate 2 whole GPUs directly, which is equivalent to requesting nvidia.com/gpu: 20. If I set nvidia.com/gpu > 1, I get the error ‘request for “nvidia.com/gpu”: invalid request: maximum request size for shared resources is 1; found 10’, which is unexpected.
In the meantime, is there any way to request more than 1 replica from each GPU on my node?
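For reference, here is a rough sketch of the setup being described (the replica count, pod name, image, and resource values are illustrative, not taken from an actual cluster):

```yaml
# Device-plugin sharing config (sketch): split each physical GPU into 10 MPS replicas.
version: v1
sharing:
  mps:
    resources:
    - name: nvidia.com/gpu
      replicas: 10
---
# Pod spec (sketch): requesting more than one shared replica, e.g. the equivalent
# of two whole GPUs, triggers the "maximum request size for shared resources is 1"
# error quoted above.
apiVersion: v1
kind: Pod
metadata:
  name: mps-example          # hypothetical name
spec:
  containers:
  - name: cuda-workload
    image: nvidia/cuda:12.4.1-base-ubuntu22.04
    resources:
      limits:
        nvidia.com/gpu: 20   # rejected: shared (MPS) resources allow at most 1 per container
```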
If failRequestsGreaterThanOne=true were set in either of these configurations (MPS or TimeSlicing) and a user requested more than one nvidia.com/gpu or nvidia.com/gpu.shared resource in their pod spec, then the request would fail with the error you've seen.
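For reference, a hedged sketch of where that option sits in the device plugin's sharing config (shown here for time-slicing; the field names follow the plugin's documented config format, and the replica count is illustrative):

```yaml
version: v1
sharing:
  timeSlicing:
    renameByDefault: false
    # When true, a container requesting more than one shared replica is rejected
    # with the "maximum request size for shared resources is 1" error.
    failRequestsGreaterThanOne: true
    resources:
    - name: nvidia.com/gpu
      replicas: 10
```

With renameByDefault enabled the shared replicas would instead be advertised as nvidia.com/gpu.shared, which is why the error can refer to either resource name.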
I also want to know the answer to this question: you can request multiple GPU resources when not using MPS, but you cannot request multiple GPU resources once MPS is enabled. @agrogov