Hi eksctl team,

We are testing the EKS optimized GPU AMI amazon-eks-gpu-node-1.24-v20231002.

I followed the guide at https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html. After the GPU nodes joined the cluster via eksctl, I saw the pods fail to start with this error:

Failed to pull image "public.ecr.aws/neuron/neuron-device-plugin:2.1.2.0": rpc error: code = Unknown desc = failed to pull and unpack image "public.ecr.aws/neuron/neuron-device-plugin:2.1.2.0": failed to resolve reference "public.ecr.aws/neuron/neuron-device-plugin:2.1.2.0": failed to do request: Head "https://public.ecr.aws/v2/neuron/neuron-device-plugin/manifests/2.1.2.0": dial tcp: lookup public.ecr.aws on 10.8.192.2:53: no such host

Per https://docs.aws.amazon.com/eks/latest/userguide/inferentia-support.html:

"When launching a node group with Inf1 instances, eksctl automatically installs the AWS Neuron Kubernetes device plugin. This plugin advertises Neuron devices as a system resource to the Kubernetes scheduler, which can be requested by a container."

What feature/behavior/change do you want?

It would be nice for eksctl to have the ability to configure the manifest of neuron-device-plugin.

Why do you want this feature?

public.ecr.aws is not allowed in our intranet, and we want to add some node affinity to this DaemonSet.

Hello HenryXie1 👋 Thank you for opening an issue in eksctl project. The team will review the issue and aim to respond within 1-5 business days. Meanwhile, please read about the Contribution and Code of Conduct guidelines here. You can find out more information about eksctl on our website.
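To make the two requested changes concrete (pulling the plugin image from a registry reachable inside the intranet, and pinning the DaemonSet to specific nodes), here is a minimal sketch of the patching step applied to a manifest that has already been loaded into a Python dict. It is not an eksctl feature or API. The function name `customize_device_plugin`, the mirror registry `registry.example.internal/mirror`, and the stand-in manifest are all hypothetical; only the image reference comes from the error above, and the field paths follow the standard Kubernetes DaemonSet schema.

```python
import copy

PUBLIC_ECR = "public.ecr.aws/"


def customize_device_plugin(manifest, mirror_registry, instance_types):
    """Return a patched copy of a DaemonSet manifest (hypothetical helper).

    - Images hosted on public.ecr.aws are rewritten to point at mirror_registry.
    - A required node affinity on the instance-type label is set, replacing
      any nodeAffinity already present in the pod spec.
    """
    patched = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    pod_spec = patched["spec"]["template"]["spec"]

    for container in pod_spec.get("containers", []):
        image = container["image"]
        if image.startswith(PUBLIC_ECR):
            # Keep the repository path and tag, swap only the registry host.
            container["image"] = (
                mirror_registry.rstrip("/") + "/" + image[len(PUBLIC_ECR):]
            )

    pod_spec.setdefault("affinity", {})["nodeAffinity"] = {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {
                            "key": "node.kubernetes.io/instance-type",
                            "operator": "In",
                            "values": list(instance_types),
                        }
                    ]
                }
            ]
        }
    }
    return patched


# Minimal stand-in for the plugin DaemonSet; NOT the real manifest.
daemonset = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {"name": "neuron-device-plugin-daemonset"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "neuron-device-plugin",
                        "image": "public.ecr.aws/neuron/neuron-device-plugin:2.1.2.0",
                    }
                ]
            }
        }
    },
}

patched = customize_device_plugin(
    daemonset,
    mirror_registry="registry.example.internal/mirror",  # hypothetical mirror
    instance_types=["inf1.xlarge"],
)
print(patched["spec"]["template"]["spec"]["containers"][0]["image"])
# prints: registry.example.internal/mirror/neuron/neuron-device-plugin:2.1.2.0
```

The sketch assumes the image has already been copied to the mirror registry; it only rewrites the reference and does not transfer the image itself.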