fix: update/pin dependencies to get ONNX runtime working again (#107)
#### Motivation

Internal regression tests fail when using the ONNX Runtime, with an error
indicating a dependency problem between ONNX Runtime and cuDNN:
```
Shard 0: 2024-07-31 19:38:04.423164988 [E:onnxruntime:Default, provider_bridge_ort.cc:1745 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1426 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcudnn.so.9: cannot open shared object file: No such file or directory
```

I found that ORT 1.18.1 started building against cuDNN 9 (noted in the
[release
notes](https://github.com/Microsoft/onnxruntime/releases/tag/v1.18.1)).
However, PyTorch does not use cuDNN 9 until 2.4.0, so I pinned ONNX Runtime
to 1.18.0. In updating poetry.lock, I let other deps update as well, but
found other compatibility issues and had to pin transformers and optimum as
well to get internal tests passing.

#### Modifications

- pin the onnxruntime version to 1.18.0
- pin transformers to 4.40.2 (and remove the separate `pip install` for it)
- pin optimum to 1.20
- run `poetry update` to regenerate poetry.lock (the pins are sketched below)
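
The pins themselves go in `pyproject.toml`; a rough CLI equivalent for
reference (hypothetical commands, and the exact extras/groups used in this
repo may differ):

```
# Hypothetical: the commit pins these in pyproject.toml rather than via the CLI
poetry add onnxruntime==1.18.0 transformers==4.40.2 optimum==1.20
# the GPU build (onnxruntime-gpu) gets the same 1.18.0 pin where the onnx-gpu extra applies
poetry update   # re-resolve and rewrite poetry.lock with the new constraints
```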

#### Result

`DEPLOYMENT_FRAMEWORK=hf_optimum_ort` works again and the internal
regression tests pass.

---------

Signed-off-by: Travis Johnson <[email protected]>
tjohnson31415 authored Aug 5, 2024
1 parent 572e03f commit 015070b
Showing 3 changed files with 655 additions and 695 deletions.
Dockerfile (6 changes: 0 additions & 6 deletions)

```
@@ -164,9 +164,6 @@ RUN cd server && \
     make gen-server && \
     pip install ".[accelerate]" --no-cache-dir
 
-# temp: install newer transformers lib that optimum clashes with
-RUN pip install transformers==4.40.0 tokenizers==0.19.1 --no-cache-dir
-
 # Patch codegen model changes into transformers
 RUN cp server/transformers_patch/modeling_codegen.py ${SITE_PACKAGES}/transformers/models/codegen/modeling_codegen.py
 
@@ -290,9 +287,6 @@ COPY server server
 # Ref: https://onnxruntime.ai/docs/install/#install-onnx-runtime-gpu-cuda-12x
 RUN cd server && make gen-server && pip install ".[accelerate, ibm-fms, onnx-gpu, quantize]" --no-cache-dir --extra-index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
 
-# temp: install newer transformers lib that optimum clashes with
-RUN pip install transformers==4.40.0 tokenizers==0.19.1 --no-cache-dir
-
 # Patch codegen model changes into transformers 4.35
 RUN cp server/transformers_patch/modeling_codegen.py ${SITE_PACKAGES}/transformers/models/codegen/modeling_codegen.py
```
