Support Ubuntu 20.04 #512
Comments
It seems the Python support also doesn't work on 20.04 because it looks for libpython3.6m.so.1.0. Ubuntu 20.04 ships with Python 3.8.2, and there's no easy way to get Python 3.6. |
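For anyone who wants to confirm which libpython the dynamic loader can actually see, here is a minimal sketch. It only uses the standard library's `ctypes.util.find_library`; the helper name `find_libpython` and the version suffixes `3.6m` and `3.8` are assumptions taken from the library name and Python version mentioned above, not anything swift-jupyter itself does.

```python
from ctypes.util import find_library


def find_libpython(versions=("3.6m", "3.8")):
    """Map each libpython version suffix to the library name found, or None.

    The suffixes are assumptions based on this thread (3.6m is what the
    18.04 release looked for, 3.8 is what Ubuntu 20.04 ships).
    """
    return {v: find_library(f"python{v}") for v in versions}


if __name__ == "__main__":
    for version, name in find_libpython().items():
        print(f"libpython{version}: {name or 'not found'}")
```

If the `3.6m` entry comes back as not found while `3.8` is present, that would be consistent with running an 18.04 build on 20.04.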
Can you tell me what specifically you did to encounter this problem, so that I can make sure that the ubuntu20.04 builds don't have this problem? |
Tried running swift-jupyter as described here. When starting the kernel, I saw errors like:
|
I think the Python 3.6 vs 3.8 issue was a symptom of my using a release built on Ubuntu 18.04 on 20.04. I built the toolchain from source and got a build to succeed on 20.04 with CUDA 11.0 and cuDNN 8.0.2. The only real bug I had to fix is described here: |
I made some progress: #535. I'm still waiting on https://gitlab.com/nvidia/container-images/cuda/-/issues/83 before I can add CUDA toolchains for Ubuntu 20.04. |
@marcrasi toolchains have been updated! |
I tried to make a CUDA build for Ubuntu 20.04, but there is still a small blocker: the version of TF that we use (2.3) supports CUDA 11.0 but not CUDA 11.1, and NVIDIA publishes Docker images for Ubuntu 20.04 with CUDA 11.1 but not CUDA 11.0. I'm not sure if TF 2.4 supports CUDA 11.1, but I'll try again once we upgrade to TF 2.4 (which we're trying to do soon). |
@marcrasi It's my understanding that 2.4 is the first release that officially supports CUDA 11.0 (https://github.com/tensorflow/tensorflow/releases/tag/v2.4.0), so I'm not sure how you got 11.0 working in the first place (a master pull?). CUDA 11.1 is the release that supports the new Ampere consumer cards (11.0 is just for the A100 series), so it would be nice to have that in particular (tensorflow/tensorflow#44750). 11.2 is already out as well! |
also, @texasmichelle you might run this and look at the logs being spit out:
|
@brettkoonce Can you share what you're seeing? I'm getting a warning about disk size, but otherwise that command seems to be working. Are you running in a project that has quota? |
Or are you pointing this out as an example of a toolchain running with cuda 11 support? |
@texasmichelle I was seeing some weird errors when running swift-models (e.g. lenet-mnist), but in retrospect I think what's going on is that you packaged the CUDA 10.2 version with your deep learning build. After pulling the CUDA 11 build (e.g. swift-tensorflow-RELEASE-0.12-cuda11.0-cudnn8-ubuntu18.04.tar.gz) everything works fine. It might be worth considering moving to 11.0 going forward. Still seeing tensorflow/swift-models#704 fwiw. |
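Since a couple of the problems in this thread came down to pulling a toolchain built for a different CUDA or Ubuntu version, a quick sanity check is to read the versions out of the tarball name. This is only a sketch: the regular expression is inferred from the single filename quoted above, and the CPU-only name without the `-cudaX-cudnnY` part is an assumption. The function `parse_toolchain_name` is a hypothetical helper, not part of any Swift for TensorFlow tooling.

```python
import re

# Pattern inferred from the one filename quoted in this thread; the optional
# CUDA/cuDNN group (for CPU-only tarballs) is an assumption.
TOOLCHAIN_RE = re.compile(
    r"swift-tensorflow-RELEASE-(?P<release>[\d.]+)"
    r"(?:-cuda(?P<cuda>[\d.]+)-cudnn(?P<cudnn>\d+))?"
    r"-ubuntu(?P<ubuntu>\d+\.\d+)\.tar\.gz"
)


def parse_toolchain_name(filename):
    """Return the release, CUDA, cuDNN and Ubuntu versions encoded in a name.

    Returns None if the name does not follow the assumed pattern. The CUDA
    and cuDNN entries are None when the name has no CUDA component.
    """
    match = TOOLCHAIN_RE.fullmatch(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    name = "swift-tensorflow-RELEASE-0.12-cuda11.0-cudnn8-ubuntu18.04.tar.gz"
    print(parse_toolchain_name(name))
```

Comparing the `ubuntu` and `cuda` fields against the host before unpacking would catch the 18.04-on-20.04 and 10.2-vs-11.0 mismatches described here.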
ah, I see what you mean. I also tried using |
@brettkoonce You can expect to see DLVMs with v0.12 right after the freeze, e.g. by Jan. 8. I also verified that cuda 11.0 is included in the existing toolchain and will remain going forward. |
1 week ago =>
So could we get Ubuntu precompiled with CUDA (preferably the 11.1 version, for Ampere support :D [ |
Ubuntu 20.04 LTS was released on April 23, 2020. It would be nice to support this latest LTS version.
Here's what I've needed to do to get version 0.11 working on Ubuntu 20.04:
sudo apt install libncurses5 libtinfo5
So maybe just adding that to the installation instructions for now would be a good start. Updating the code to support the newer libs would be another option.
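To check whether that workaround is still needed on a given machine, something like the following could be run before installing the toolchain. It is a rough sketch: the sonames `libncurses.so.5` and `libtinfo.so.5` are assumed to be what the `libncurses5` and `libtinfo5` packages provide, and `missing_libs` is a hypothetical helper that simply asks the dynamic loader to open each one.

```python
import ctypes

# Assumed sonames for the libncurses5 and libtinfo5 packages.
REQUIRED = ("libncurses.so.5", "libtinfo.so.5")


def missing_libs(sonames=REQUIRED):
    """Return the sonames that the dynamic loader cannot open."""
    missing = []
    for soname in sonames:
        try:
            ctypes.CDLL(soname)
        except OSError:
            # The loader could not find or open this library.
            missing.append(soname)
    return missing


if __name__ == "__main__":
    missing = missing_libs()
    if missing:
        print("missing:", ", ".join(missing))
        print("try: sudo apt install libncurses5 libtinfo5")
    else:
        print("legacy ncurses libraries are present")
```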