Issues: NVIDIA/gpu-operator
NOTICE: Containers losing access to GPUs with error: "Failed ...
#485
opened Feb 7, 2023 by
cdesiniotis
Open
Issues list
discovery-worker can't be ready. failed to read cpufreq directory
#1116
opened Nov 17, 2024 by
loprx
bug: operator anti-pattern, validator pod deployments cause CrashBackLoop behaviour
#1114
opened Nov 13, 2024 by
justinthelaw
container-toolkit fails to start after upgrading to v24.9.0 on k3s cluster
bug
#1109
opened Nov 7, 2024 by
logan2211
NVIDIA Device Plugin Only Exposes One GPU Out of Two GPUs Installed on Single Node
#1079
opened Oct 29, 2024 by
amir-bialek
chroot: failed to run command 'nvidia-smi': No such file or directory
#1063
opened Oct 24, 2024 by
vanloswang
ServiceAccount node-feature-discovery should not be included in ClusterRoleBinding when nfd.enabled: false
#1038
opened Oct 14, 2024 by
cmontemuino
Allow adding custom labels and securityContext to the components deployed by ClusterPolicy
#1030
opened Oct 10, 2024 by
inesshz
Not able to view Gpu utilization metrics in openshift dashboard
#1002
opened Sep 20, 2024 by
umeshvw
Following gpu-operator documentation will break RKE2 cluster after reboot
#992
opened Sep 16, 2024 by
aiicore
containerd restart from nvidia-container-toolkit causes other daemonsets to get stuck
#991
opened Sep 13, 2024 by
chiragjn