
Is there any way in the meantime to request more than 1 replica from each GPU in my node? #929

Open
wei1793786487 opened this issue Aug 27, 2024 · 4 comments

Comments

@wei1793786487

I have enabled MPS with a division factor of 10 (replicas: 10), but in our application scenario we might want to allocate 2 whole GPUs, which would be equivalent to specifying nvidia.com/gpu: 20. If I set nvidia.com/gpu > 1, I encounter the error: ‘request for “nvidia.com/gpu”: invalid request: maximum request size for shared resources is 1; found 10, which is unexpected’.

Is there any way in the meantime to request more than 1 replica from each GPU in my node?
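
For reference, a minimal sketch of the kind of pod spec that triggers the error; the pod name and container image here are illustrative, not the real ones from our cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mps-multi-gpu-test        # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: cuda-test
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed image, any CUDA image would do
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 2       # anything > 1 is rejected once MPS sharing is enabled
```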

@wei1793786487 (Author)

My configuration file is:

version: v1
sharing:
  mps:
    resources:
      - name: nvidia.com/gpu
        replicas: 10

@agrogov commented Sep 25, 2024

If failRequestsGreaterThanOne=true is set in either of these configurations (MPS or time-slicing) and a user requests more than one nvidia.com/gpu or nvidia.com/gpu.shared resource in their pod spec, then the container fails with the error you've seen.
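
For comparison, time-slicing exposes this flag in the sharing config, so it can be set to false there; a sketch with illustrative values:

```yaml
version: v1
sharing:
  timeSlicing:
    failRequestsGreaterThanOne: false   # allow nvidia.com/gpu requests > 1 (time-slicing only)
    resources:
      - name: nvidia.com/gpu
        replicas: 10
```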

@ZYWNB666

I also want to know the answer to this question. You can request multiple GPU resources when you are not using MPS, but you cannot request multiple GPU resources once MPS is enabled.
@agrogov

@agrogov commented Nov 14, 2024

@ZYWNB666 as far as I can tell, failRequestsGreaterThanOne is always set to true when MPS is used, so there is no way to change this behaviour...
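
If the goal is just to consume more than one MPS replica inside a single pod, one possible (untested) workaround is to request one shared replica per container, since the size check appears to apply to each container's request; note that this gives no control over which physical GPU backs each replica. Names and image below are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mps-two-replicas            # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: worker-0
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1         # one replica per container passes the size check
    - name: worker-1
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
```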
