I think your original error occurs because you need to run docker login nvcr.io first. That said, I suspect nvcr.io/ea-bignlp/ea-mm-sd-alpha/bignlp-mm-sd:23.09-py3 is also deprecated, so we should use the NeMo container directly instead (https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo).
However, after switching to the NeMo container (I tried nvcr.io/nvidia/nemo:23.08 and nvcr.io/nvidia/nemo:23.10) and running docker build with the updated Dockerfile, I hit dependency issues: the build fails while patching PyTorch Lightning:
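The two steps suggested above (log in to nvcr.io, then build against a NeMo base image) could be scripted roughly as follows. This is only a sketch under assumptions: the environment variable name NGC_API_KEY is hypothetical, the base image tag is one of those mentioned in this thread, and it relies on NGC's convention of using the literal username $oauthtoken with an API key as the password. When no key is set, it prints the commands instead of running them.

```python
import os
import shlex
import subprocess

REGISTRY = "nvcr.io"
# Assumption: one of the NeMo tags mentioned in this thread.
BASE_IMAGE = "nvcr.io/nvidia/nemo:23.10"
IMAGE_TAG = "nvidia_stablediffusion_pytorch_mlperf3.1"


def login_command(registry=REGISTRY):
    # NGC expects the literal username "$oauthtoken"; the API key is read from stdin.
    return ["docker", "login", "--username", "$oauthtoken", "--password-stdin", registry]


def build_command(base_image=BASE_IMAGE, tag=IMAGE_TAG, context="."):
    # --build-arg overrides the default of ARG FROM_IMAGE_NAME in the Dockerfile.
    return ["docker", "build", "--build-arg", f"FROM_IMAGE_NAME={base_image}", "-t", tag, context]


def main():
    api_key = os.environ.get("NGC_API_KEY")  # hypothetical variable name
    if not api_key:
        # Dry run: show what would be executed.
        print(shlex.join(login_command()))
        print(shlex.join(build_command()))
        return
    subprocess.run(login_command(), input=api_key.encode(), check=True)
    subprocess.run(build_command(), check=True)


if __name__ == "__main__":
    main()
```

Passing the base image through --build-arg leaves the Dockerfile untouched, which makes it easy to compare the 23.08 and 23.10 tags.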
Hunk #1 succeeded at 614 (offset 4 lines).
patching file strategies/ddp.py
Hunk #1 FAILED at 191.
1 out of 1 hunk FAILED -- saving rejects to file strategies/ddp.py.rej
patching file trainer/connectors/logger_connector/result.py
Hunk #1 FAILED at 502.
Hunk #2 FAILED at 512.
2 out of 2 hunks FAILED -- saving rejects to file trainer/connectors/logger_connector/result.py.rej
The command '/bin/sh -c PL_ROOT=$(python -c "import pytorch_lightning; print(pytorch_lightning.__file__.replace('/__init__.py',''))"); patch -p3 -d${PL_ROOT} -i /source/lightning.v1.9.4.patch' returned a non-zero code: 1
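The patch file name (lightning.v1.9.4.patch) suggests it was written against PyTorch Lightning 1.9.4, so the failed hunks may simply mean the NeMo containers ship a different Lightning version. That is an assumption, not something confirmed in this thread. A small check along these lines, run inside the container, could help confirm or rule it out; the helper names are made up for illustration, and the file-name convention is inferred from the single patch name in the log.

```python
import re
from importlib import metadata


def patch_target_version(patch_name):
    """Extract '1.9.4' from a name like 'lightning.v1.9.4.patch' (naming convention assumed)."""
    match = re.search(r"v(\d+(?:\.\d+)*)\.patch$", patch_name)
    return match.group(1) if match else None


def installed_lightning_version():
    """Return the installed pytorch-lightning version, or None if it is not installed."""
    try:
        return metadata.version("pytorch-lightning")
    except metadata.PackageNotFoundError:
        return None


def patch_matches(patch_name, installed):
    """True only when the version in the patch name equals the installed version."""
    target = patch_target_version(patch_name)
    return target is not None and installed == target


if __name__ == "__main__":
    name = "lightning.v1.9.4.patch"
    installed = installed_lightning_version()
    print(f"patch targets {patch_target_version(name)}, installed: {installed}")
    if patch_matches(name, installed):
        print("versions match")
    else:
        print("version mismatch: the patch may not apply cleanly")
```

If the versions differ, the options are roughly to pin Lightning to the version the patch targets or to regenerate the patch against the installed version; which one is appropriate depends on what the rest of the container expects.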
The Docker build for Stable Diffusion fails as shown below (I tried both NVIDIA's and Dell's implementations).
Could you please help?
docker build . -t nvidia_stablediffusion_pytorch_mlperf3.1
[+] Building 1.7s (3/3) FINISHED docker:default
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 719B 0.0s
=> ERROR [internal] load metadata for nvcr.io/ea-bignlp/ea-mm-sd-alpha/bignlp-mm-sd:23.09-py3 1.7s
Dockerfile:2
1 | ARG FROM_IMAGE_NAME=nvcr.io/ea-bignlp/ea-mm-sd-alpha/bignlp-mm-sd:23.09-py3
2 | >>> FROM ${FROM_IMAGE_NAME}
3 |
4 | RUN pip install --upgrade webdataset
ERROR: failed to solve: nvcr.io/ea-bignlp/ea-mm-sd-alpha/bignlp-mm-sd:23.09-py3: pulling from host nvcr.io failed with status code [manifests 23.09-py3]: 401 Unauthorized