ptxas executable #72
Comments
Currently |
Understood. Thank you for the explanation |
@leofang, I am a little confused about what is and is not available through the main cudatoolkit (the one in this stock). In particular:
I am primarily interested in essentially recreating what NVIDIA offers in their NGC containers in our packaging of tensorflow and pytorch; one of the key missing items I've been working on is activating XLA for tensorflow, which requires the correct compilers etc. being available. Do you have any advice? |
If not available, where are these things available in conda-forge? For example, I saw that some things are available through cudatoolkit-dev and I believe we have nvcc feedstock... Are we supposed to be using them that way? My understanding has been that these things should be bundled in cudatoolkit (as the first sentence above asserts) |
cc @jakirkham for viz and comment |
|
Where? Here?
Okay, let me try to see what exactly is needed for tensorflow and pytorch and we can work on addressing these issues as they come.
Since you talk about the EULA, etc. --- is using that conda-forge CI image for someone's production work okay or is it only for CI? I believe I saw it was based on the cuda-devel Docker images, so the licensing might be exactly the same as that (those cuda-devel images are the main building blocks for all NGC containers as far as I understand...) |
Thanks for the prompt and clear answer by the way, 👍 @leofang |
Btw, as far as I could tell, we are good for the XLA implementation, though I need to do more local testing to see if there are additional issues to resolve. @leofang, if you're interested in having a look, see conda-forge/tensorflow-feedstock#246 |
Yes
I think it's OK. The CUDA images and their derivatives (including conda-forge's) are permissive. By using them users acknowledge the terms and conditions. |
I have just encountered the issue that a particular TF model built from TF Hub and run in JupyterLab in an Anaconda environment in which I had installed cuDNN and cudatoolkit raised issues because is missing from the conda-forge source. I confess I cannot follow the discussion above. Can someone explain in plainer language why this happened in CUDA toolkit 11.2 and what the future will be like? It doesn't make sense to someone like me: if the DLLs etc. are available then why not ptxas.exe??? I'm a user... I just want it to run. (NB finding out why ptxas was an issue and what to do about it was a PITA - now I have an installation, can I just copy ptxas into the environment somewhere appropriate??) I had to install CUDA in the OS, which undermines the value of having conda environments with the CUDA stuff in them. Setup: Win-10 Home 64 bit 21H2. UPDATE Issue finally resolved by But... hours to work this out. I'm going to give the TF crew a fair share of the blame for not giving enough info about ptxas and where TF is looking etc. Not their first offense... XLA_FLAGS is also screwed up |
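The comment above does not say what went wrong with XLA_FLAGS. For readers who have not used it, here is a minimal sketch of one way to point XLA at a specific CUDA directory before TensorFlow is imported. `--xla_gpu_cuda_data_dir` is an existing XLA flag; the helper name `xla_flags_with_cuda_dir` is hypothetical, and using `CONDA_PREFIX` as the CUDA directory is an assumption that may not match where a given package places ptxas (especially on Windows).

```python
import os


def xla_flags_with_cuda_dir(cuda_dir, existing=""):
    """Return an XLA_FLAGS string whose CUDA data dir is cuda_dir.

    Hypothetical helper for illustration. Any previous
    --xla_gpu_cuda_data_dir entry in `existing` is replaced; other flags
    are kept. Flags are split on whitespace, so a directory containing
    spaces would need different handling.
    """
    prefix = "--xla_gpu_cuda_data_dir="
    kept = [flag for flag in existing.split() if not flag.startswith(prefix)]
    return " ".join(kept + [prefix + cuda_dir])


# Assumption: the active conda environment is the directory XLA should search.
# This must run before `import tensorflow` to have any effect.
conda_prefix = os.environ.get("CONDA_PREFIX")
if conda_prefix:
    os.environ["XLA_FLAGS"] = xla_flags_with_cuda_dir(
        conda_prefix, os.environ.get("XLA_FLAGS", "")
    )
```

Whether this is enough depends on the layout of the installed packages, so treat it as a starting point for experimentation rather than a fix.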
Hmmm... view on github didn't seem to work, but here goes....
Error arose while executing this in a downloaded ViT (Vision Transformer
ipynb attached, without error info, showing working result after ptxas made
available)
model_url = 'https://tfhub.dev/sayakpaul/vit_s16_classification/1'
classification_model = tf.keras.Sequential(
[hub.KerasLayer(model_url)]
)
predictions = classification_model.predict(image)
predicted_label = imagenet_int_to_str[int(np.argmax(predictions))]
predicted_label
I can't recall (and didn't save :( ) the error details except that the
right ptxas was not found. I did find these notes though...
Then TF complains about the ptxas version (there are 2 on c: one at
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin\ptxas.exe and
the other in
C:\Users\Julian\AppData\Roaming\Mathematica\Paclets\Repository\CUDAResources-Win64-10.5.0\CUDAToolkit\bin\ptxas.exe)
It says maybe I can just put the right version (?>11.1) in the cuda path,
i.e. into the OS CUDA 10
Obviously I wanted it to take it from the env, which *conda install -c
nvidia cuda-nvcc* achieved. TF is looking in the env first AFAICT, and then
falls back to an OS install - complex errors depending on env/os cuda
installation existence + version,
Setup is TF2.9.1
# Name                        Version   Build         Channel
tensorboard                   2.9.0     pypi_0        pypi
tensorboard-data-server       0.6.1     pypi_0        pypi
tensorboard-plugin-wit        1.8.1     pypi_0        pypi
tensorflow                    2.9.1     pypi_0        pypi
tensorflow-addons             0.17.0    pypi_0        pypi
tensorflow-datasets           4.5.2     pypi_0        pypi
tensorflow-estimator          2.9.0     pypi_0        pypi
tensorflow-gan                2.1.0     pypi_0        pypi
tensorflow-hub                0.12.0    pypi_0        pypi
tensorflow-io-gcs-filesystem  0.26.0    pypi_0        pypi
tensorflow-metadata           1.8.0     pypi_0        pypi
tensorflow-probability        0.16.0    pypi_0        pypi
tensorflow-text               2.9.0     pypi_0        pypi
cuda-nvcc                     11.7.64   0             nvidia
cudatoolkit                   11.2.2    h933977f_10   conda-forge
cupy-cuda112                  10.5.0    pypi_0        pypi
Python 3.10.4, Win-10 Home, 64 bit.
HTH BR, Julian
…On Thu, 2 Jun 2022 at 02:28, ngam ***@***.***> wrote:
@JulianSMoore <https://github.com/JulianSMoore> What exactly is your
error and what tensorflow are you using? Upgrade to 2.8.1 if you can!
|
Not obvious to me that the notebook attached to email is available here, so attaching it separately in zip. |
@JulianSMoore sorry we can't be of more help. A few things:
it seems that you are installing things through the TensorFlow-recommended route (PyPI), so I would recommend you ask a question on their forums. We simply don't have the knowledge to help you troubleshoot your system on Windows. |
@hmaarrfk Perfectly understood (wasn't expecting you to troubleshoot!) & your info will be helpful. For the benefit of others: some required s/w support for TensorFlow (e.g. ptxas, for ViT model from TF Hub) seems to lie outside cudatoolkit. If you encounter a similar issue, first check your paths, then think about the libraries used/needed and finally consider packages from different channels. Hard to be more specific than that, unfortunately. (I use conda installation for CUDA toolkit and cuDNN because that is the only way I know to install CUDA in an Anaconda env (rather than OS) - everything else I do with pip) |
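As a rough illustration of the "first check your paths" step, the sketch below lists every ptxas executable visible on PATH, in lookup order, and prints what each one reports for `--version`. The function names are hypothetical, and the script only inspects PATH; it does not reproduce TensorFlow's own search logic, which may look in other locations.

```python
import os
import subprocess


def find_ptxas_candidates(path_env=None):
    """Return every ptxas executable found on the search path, in order.

    Hypothetical helper: scans PATH (or the string passed in) only.
    """
    exe = "ptxas.exe" if os.name == "nt" else "ptxas"
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    found = []
    for directory in path_env.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, exe)
        if os.path.isfile(candidate) and candidate not in found:
            found.append(candidate)
    return found


def ptxas_version(path):
    """Return the text printed by `ptxas --version`, or None on failure."""
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    candidates = find_ptxas_candidates()
    if not candidates:
        print("No ptxas found on PATH")
    for candidate in candidates:
        print(candidate)
        print(ptxas_version(candidate))
```

Running this inside the activated environment shows whether an environment-provided ptxas comes before an older OS-level one, such as the CUDA 10.0 copy mentioned earlier in the thread.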
Any update on this? We need ptxas for tensorflow and jax going forward. I am not sure if there is any point in continuing our crazy efforts in maintaining cuda builds if we are not going to have access to ptxas: It is simply needed. I am personally not going to participate in any cuda builds in tensorflow and jaxlib until this is fixed. (I have been the primary pusher for the latest tensorflow and jaxlib builds as others are busier than usual.) At the end of the day, if someone has to install system cudatoolkit anyway, there is no point in getting it from conda-forge. I would be more inclined to pursue lighter builds along the lines of #81 instead. @conda-forge/cudatoolkit could we please get some clarity on this soon? Or at least a response about what is stopping us from resolving it? cc @conda-forge/core |
Isn't the main problem that the Nvidia EULA prevents us from distributing PTXAS and other binaries? We can't just ignore that. This all will go away when #62 lands and Nvidia officially distributes their packages on conda, which I assume would have a permissive enough license for us to redistribute. Til then... we can't do much, sorry. |
Let's close this issue now that it is resolved with CUDA 12. Thanks everyone for the discussion and request. |
I believe that the ptxas executable should be available here.
It seems that tensorflow (at least v1) attempts to use it.
However, when I create a fresh environment with cudatoolkit 11.2 it doesn't seem to be included.
Issue:
Environment (conda list):
Details about conda and system (conda info):
xref: conda-forge/tensorflow-feedstock#170