[Feature] Request to be able to config manifests of neuron-device-plugin #7175

Closed
HenryXie1 opened this issue Oct 12, 2023 · 3 comments
Labels
kind/feature New feature or request stale

HenryXie1 commented Oct 12, 2023

Hi eksctl team,
we are testing the EKS optimized GPU AMI amazon-eks-gpu-node-1.24-v20231002.

I followed the guide at
https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html

After the GPU nodes joined the cluster via eksctl, I saw the pods fail to start with this error:

Failed to pull image "public.ecr.aws/neuron/neuron-device-plugin:2.1.2.0": rpc error: code = Unknown desc = failed to pull and unpack image "public.ecr.aws/neuron/neuron-device-plugin:2.1.2.0": failed to resolve reference "public.ecr.aws/neuron/neuron-device-plugin:2.1.2.0": failed to do request: Head "https://public.ecr.aws/v2/neuron/neuron-device-plugin/manifests/2.1.2.0": dial tcp: lookup public.ecr.aws on 10.8.192.2:53: no such host

According to https://docs.aws.amazon.com/eks/latest/userguide/inferentia-support.html:

When launching a node group with Inf1 instances, eksctl automatically installs the AWS Neuron Kubernetes device plugin. This plugin advertises Neuron devices as a system resource to the Kubernetes scheduler, which can be requested by a container.

public.ecr.aws is not reachable from our intranet, so the image cannot be pulled, and we also want to add node affinity to this DaemonSet.
It would be nice if eksctl let us configure the manifest of the neuron-device-plugin.
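
To make the request concrete, here is a minimal sketch of the kind of customization meant, written as a plain Python transformation of a DaemonSet manifest. Everything in it is an assumption for illustration: `customize_daemonset` is a hypothetical helper and not an eksctl API, `registry.internal.example` is a made-up private mirror, the instance types are example values, and the `daemonset` dict is a trimmed stand-in, not the manifest eksctl actually applies.

```python
import copy


def customize_daemonset(manifest, source_registry, mirror_registry, instance_types):
    """Return a copy of a DaemonSet manifest that pulls from a mirror registry
    and only schedules onto the given instance types. Hypothetical helper."""
    patched = copy.deepcopy(manifest)
    pod_spec = patched["spec"]["template"]["spec"]

    # Swap the registry host on every image that comes from source_registry.
    for container in pod_spec["containers"]:
        image = container["image"]
        if image.startswith(source_registry + "/"):
            container["image"] = mirror_registry + image[len(source_registry):]

    # Add a required node affinity on the well-known instance-type label.
    # Note: this overwrites any affinity already present in the manifest.
    pod_spec["affinity"] = {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                "key": "node.kubernetes.io/instance-type",
                                "operator": "In",
                                "values": list(instance_types),
                            }
                        ]
                    }
                ]
            }
        }
    }
    return patched


# Trimmed stand-in manifest (assumption); only the image comes from the error above.
daemonset = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {"name": "neuron-device-plugin-daemonset"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "neuron-device-plugin",
                        "image": "public.ecr.aws/neuron/neuron-device-plugin:2.1.2.0",
                    }
                ]
            }
        }
    },
}

patched = customize_daemonset(
    daemonset,
    source_registry="public.ecr.aws",
    mirror_registry="registry.internal.example",  # hypothetical private mirror
    instance_types=["inf1.xlarge", "inf1.2xlarge"],  # example values
)
```

Being able to express these two changes (image location and affinity) through eksctl is what this issue asks for.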

What feature/behavior/change do you want?

Why do you want this feature?

@HenryXie1 HenryXie1 added the kind/feature New feature or request label Oct 12, 2023
@github-actions
Contributor

Hello HenryXie1 👋 Thank you for opening an issue in eksctl project. The team will review the issue and aim to respond within 1-5 business days. Meanwhile, please read about the Contribution and Code of Conduct guidelines here. You can find out more information about eksctl on our website

Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label Nov 12, 2023
Contributor

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this as not planned Nov 18, 2023