docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --compat32 --graphics --utility --video --display --pid=9165 /var/lib/docker/overlay2/2a1a1c3555109e20c5ba2e386cc3ce69cbb80c3850663c1909db8c46ed565c0c/merged]\\\\nnvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/2a1a1c3555109e20c5ba2e386cc3ce69cbb80c3850663c1909db8c46ed565c0c/merged/usr/lib/aarch64-linux-gnu/libnvidia-fatbinaryloader.so.440.18: file exists\\\\n\\\"\"": unknown #295

Open
deepxiaobai opened this issue Apr 2, 2021 · 14 comments

Comments

@deepxiaobai

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused "process_linux.go:432: running prestart hook 1 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --compat32 --graphics --utility --video --display --pid=9165 /var/lib/docker/overlay2/2a1a1c3555109e20c5ba2e386cc3ce69cbb80c3850663c1909db8c46ed565c0c/merged]\\nnvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/2a1a1c3555109e20c5ba2e386cc3ce69cbb80c3850663c1909db8c46ed565c0c/merged/usr/lib/aarch64-linux-gnu/libnvidia-fatbinaryloader.so.440.18: file exists\\n\""": unknown

@deepxiaobai
Author

This error occurs when I run the command "docker run --runtime=nvidia -it --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -v /home/$USER/triton_blog/:/workspace/triton_blog nvcr.io/nvidia/l4t-tensorflow:r32.4.3-tf1.15-py3".

Environment:
Device: Jetson Xavier NX
CUDA_VERSION: 10.2
DeepStream: 5.1
Docker_Version: 19.03.6

@cvtutorials

same error

@elezar
Member

elezar commented May 18, 2021

@deepxiaobai @sjtumelc which versions of the NVIDIA container toolkit components are you using?
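
One way to collect the requested version information is sketched below. It assumes a Debian-based host (as on JetPack) where `dpkg-query` is available and where the relevant packages have `nvidia-container` or `nvidia-docker` in their names; the function name `report_nvidia_container_versions` is made up for this example.

```shell
#!/bin/sh
# Print the versions of the NVIDIA container stack components that are
# installed on the host, skipping any tool that is not present.
report_nvidia_container_versions() {
    # Version of the CLI that the prestart hook invokes.
    if command -v nvidia-container-cli >/dev/null 2>&1; then
        nvidia-container-cli --version
    else
        echo "nvidia-container-cli: not found"
    fi

    # Installed packages whose names mention nvidia-container or nvidia-docker.
    if command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -W -f='${Package} ${Version}\n' 2>/dev/null \
            | grep -E 'nvidia-(container|docker)' \
            || echo "no nvidia-container/nvidia-docker packages found"
    fi

    # Docker client version, if Docker is installed.
    if command -v docker >/dev/null 2>&1; then
        docker --version
    fi
}

report_nvidia_container_versions
```

Pasting the output into the issue gives the maintainers the toolkit, runtime, and Docker versions in one place.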

@codegastudio

codegastudio commented Jun 14, 2021

Same error.
I built a new image on a Jetson AGX, based on nvcr.io/nvidia/l4t-base:r32.5.0.
The new image works fine on the Jetson AGX, but I get the error when I try to run it on a Jetson NX.

@ChickenBites

Same error while trying to run deepstream with fatbinaryloader in nvcr.io/nvidia/deepstream-l4t:5.1-21.02-base.

@elezar
Member

elezar commented Jun 29, 2021

Looking at the contents of the image nvcr.io/nvidia/deepstream-l4t:5.1-21.02-base:

ls -alt /usr/lib/aarch64-linux-gnu/libnvidia-fatbinaryloader.so.440.18
-rw-r--r-- 1 root root 0 Feb 25 00:18 /usr/lib/aarch64-linux-gnu/libnvidia-fatbinaryloader.so.440.18

The image contains a zero-sized file with the same name as the file that is being mounted from the host, which would explain the "file exists" mount error. This could indicate a problem with how the container image was built.
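
To check another image for the same kind of placeholder, a small helper along these lines could be used. This is only a sketch: the function name `find_empty_nvidia_libs` is hypothetical, and the default directory is simply the library path that appears in the error message above.

```shell
#!/bin/sh
# List zero-sized libnvidia-* files directly inside a directory.
# The directory defaults to the aarch64 library path from the error message.
find_empty_nvidia_libs() {
    dir="${1:-/usr/lib/aarch64-linux-gnu}"
    # Nothing to report if the directory does not exist on this system.
    [ -d "$dir" ] || return 0
    # -size 0 matches only files that contain no data at all.
    find "$dir" -maxdepth 1 -type f -name 'libnvidia-*' -size 0
}
```

Calling `find_empty_nvidia_libs` inside a container started from the image without the NVIDIA runtime (so that the prestart hook does not run) should print any empty files that would collide with the host mounts; no output means none were found in that directory.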

@ChickenBites

ChickenBites commented Jun 29, 2021

@elezar I've actually symlinked libnvidia-fatbinaryloader.so.440.18 to libnvidia-fatbinaryloader.so.32.4.4 in the Dockerfile when building the image, but now the pipeline won't load. I've also tried the following images:
nvcr.io/nvidia/deepstream-l4t:5.1-21.02-samples
nvcr.io/nvidia/deepstream-l4t:5.1-21.02-iot
with the same result.

@elezar
Member

elezar commented Jun 29, 2021

Could you show how the symlinks have been set up?

@ChickenBites

@elezar The directive is:

WORKDIR /usr/lib/aarch64-linux-gnu

RUN rm -f libnvidia-fatbinaryloader.so.440.18 \
    && ln -s libnvidia-fatbinaryloader.so.32.4.4 libnvidia-fatbinaryloader.so.440.18

@hoonkai

hoonkai commented Sep 8, 2021

Same error on the Nano.

@gustavojoseleite

Same error here on AGX Xavier. The strange thing is that other Docker images work normally with the NVIDIA container runtime.

@drinktee

same error

@vertcli

vertcli commented Feb 15, 2022

Same error.

Solved by pulling the image nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf1.15-py3 instead:

docker pull nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf1.15-py3
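
The thread does not confirm why the newer tag works. One plausible reading is that the image tag should match the L4T release on the host. A rough way to check the host release is sketched below; it assumes the host has `/etc/nv_tegra_release` and that its first line has the form `# R32 (release), REVISION: 5.0, ...`, and the helper name `l4t_release_tag` is made up for this example.

```shell
#!/bin/sh
# Read an nv_tegra_release file on stdin and print the release in
# image-tag style, for example "r32.5.0". Prints nothing if the first
# line does not have the assumed "# R<major> (release), REVISION: <x.y>," form.
l4t_release_tag() {
    sed -n '1s/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\),.*/r\1.\2/p'
}

# Only query the host file when it exists (it is Jetson-specific).
if [ -r /etc/nv_tegra_release ]; then
    l4t_release_tag < /etc/nv_tegra_release
fi
```

The printed value can then be compared with the `rXX.Y.Z` prefix of the image tag before pulling.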

elezar transferred this issue from NVIDIA/nvidia-docker on Jan 22, 2024