Versions of CUDA, cuDNN, and TensorFlow all need to be compatible #22

Open
uschmidt83 opened this issue Nov 19, 2018 · 1 comment

@uschmidt83
Member

Hi,

The plugin currently uses an old TensorFlow (TF) version (1.6.0), which conflicts with the installation requirements of more recent TF versions. Specifically, current TF releases require cuDNN >= 7.2, which is too new for TF 1.6.0: that binary was compiled against cuDNN 7.0.4 and aborts when a newer cuDNN is loaded at runtime (see error below).

Furthermore, installation of cuDNN is currently not mentioned in the documentation.

Best,
Uwe
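
To see which cuDNN the process is likely to pick up, one can scan the directories on the library path that the log below prints. The following is only a minimal sketch: the helper name `find_cudnn_libraries` is hypothetical, and it assumes the Linux naming scheme `libcudnn.so.<major>.<minor>.<patch>`. The dynamic loader also searches system directories, so an empty result does not prove that cuDNN is missing.

```python
import glob
import os
import re

# Assumed file naming on Linux: libcudnn.so.<major>.<minor>.<patch>
VERSIONED_NAME = re.compile(r"libcudnn\.so\.(\d+)\.(\d+)\.(\d+)$")


def find_cudnn_libraries(library_path):
    """Return (path, version) pairs for libcudnn files on a search path.

    `library_path` is a string such as the value of LD_LIBRARY_PATH.
    `version` is a (major, minor, patch) tuple, or None when the file
    name carries no full version (for example a `libcudnn.so` symlink).
    """
    hits = []
    for directory in library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        pattern = os.path.join(directory, "libcudnn.so*")
        for path in sorted(glob.glob(pattern)):
            match = VERSIONED_NAME.search(path)
            version = tuple(int(p) for p in match.groups()) if match else None
            hits.append((path, version))
    return hits


if __name__ == "__main__":
    search_path = os.environ.get("LD_LIBRARY_PATH", "")
    for path, version in find_cudnn_libraries(search_path):
        print(path, version)
```

Directories listed earlier on the path are reported first, which roughly mirrors the order in which the loader would try them.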

$ ./ImageJ-linux64 --java-home /sw/apps/jdk/current
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
[INFO] imagej-tensorflow version: 1.0.1
[INFO] tensorflow version: 1.6.0
[INFO] The current library path is: LD_LIBRARY_PATH=/sw/apps/cuda/9.0.176/lib64:/sw/apps/cuda/9.0.176/lib:/home/uschmidt/sw/local/lib:/home/uschmidt/tmp/new_fiji/Fiji.app/lib/linux64:/home/uschmidt/tmp/new_fiji/Fiji.app/mm/linux64
[INFO] loading model net_tubulin from source http://csbdeep.bioimagecomputing.com/model-tubulin.zip
[INFO] TensorFlow model cache: /home/uschmidt/tmp/new_fiji/Fiji.app/models
2018-11-19 16:19:18.450296: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2018-11-19 16:19:18.451773: I tensorflow/cc/saved_model/loader.cc:240] Loading SavedModel with tags: { serve }; from: /home/uschmidt/tmp/new_fiji/Fiji.app/models/net_tubulin
2018-11-19 16:19:18.631059: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-19 16:19:18.631955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076
pciBusID: 0000:05:00.0
totalMemory: 11.93GiB freeMemory: 11.71GiB
2018-11-19 16:19:18.632026: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-11-19 16:22:32.125101: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11339 MB memory) -> physical GPU (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0, compute capability: 5.2)
2018-11-19 16:22:32.283239: I tensorflow/cc/saved_model/loader.cc:159] Restoring SavedModel bundle.
2018-11-19 16:22:32.329594: I tensorflow/cc/saved_model/loader.cc:194] Running LegacyInitOp on SavedModel bundle.
2018-11-19 16:22:32.330448: I tensorflow/cc/saved_model/loader.cc:289] SavedModel load for tags { serve }; Status: success. Took 193878674 microseconds.
[INFO] Shape of input tensor: [-1, -1, -1, 1]
[INFO] Shape of output tensor: [-1, -1, -1, 2]
datasetAxes:[X, Y]
nodeAxes:[Time, Y, X, Channel]
mapping:[2, 1, 0, 3]
--------------
[INFO] Normalize ..
[INFO] Dataset type: 32-bit signed float, converting to FloatType.
[INFO] Dataset dimensions: [720, 576]
[INFO] INPUT NODE:
datasetAxes:[X, Y]
nodeAxes:[Time, Y, X, Channel]
mapping:[2, 1, 0, 3]
--------------
[INFO] OUTPUT NODE:
datasetAxes:[X, Y]
nodeAxes:[Time, Y, X, Channel]
mapping:[2, 1, 0, 3]
--------------
[INFO] Dividing image into 1 tile(s)..
[INFO] Size of single image tile: [720, 576]
[INFO] Final image tiling: [1, 1]
[INFO] Network input size: [720, 576]
[INFO] Processing tile 1..
2018-11-19 16:22:33.661466: E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7104 (compatibility version 7100) but source was compiled with 7004 (compatibility version 7000).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2018-11-19 16:22:33.662665: F tensorflow/core/kernels/conv_ops.cc:717] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
[1]    28406 abort (core dumped)  ./ImageJ-linux64 --java-home /sw/apps/jdk/current
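
The integers in the cuDNN error above can be decoded by hand. As a minimal sketch, assuming the cuDNN 7 encoding `major * 1000 + minor * 100 + patch`, and assuming (as the "compatibility version" figures in the log suggest) that the check ignores the patch level, the mismatch looks like this. The function names are hypothetical and not part of TensorFlow or cuDNN.

```python
def decode_cudnn_version(code):
    """Split a cuDNN 7 style version integer into (major, minor, patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compatibility_version(code):
    """Drop the patch level, e.g. 7104 -> 7100 (assumed from the log)."""
    return code - code % 100


def same_compatibility(loaded, compiled):
    """True when both versions agree once the patch level is ignored."""
    return compatibility_version(loaded) == compatibility_version(compiled)


if __name__ == "__main__":
    loaded, compiled = 7104, 7004  # values taken from the error message
    print("loaded:  ", decode_cudnn_version(loaded))
    print("compiled:", decode_cudnn_version(compiled))
    print("compatible:", same_compatibility(loaded, compiled))
```

Read this way, the runtime library is cuDNN 7.1.4 while the TF 1.6.0 binary was built against cuDNN 7.0.4, so the two differ in the minor version and the convolution setup fails.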
@sommerc

sommerc commented May 16, 2019

Hi,

Currently, the CSBDeep update site for Fiji provides TensorFlow 1.12 bindings (tensorflow_jni), which are linked against CUDA Toolkit 9.0 and require cuDNN >= 7.2.1.

In case you want to use CUDA Toolkit 10.0, you can get the latest TensorFlow 1.13 bindings (tensorflow_jni.dll, libtensorflow.jar) from: https://www.tensorflow.org/install/lang_java

Together with cuDNN==7.5.1 everything works nicely!

Would be cool if some of the version requirements could be mentioned in the doc!

Cheers,
Chris
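
The combinations reported in this thread could be written down as a small lookup, for example as a starting point for the documentation. This is a sketch only: it records what the two comments and the log state, not an official compatibility matrix, and the names `REPORTED_SETUPS` and `matches_reported_setup` are hypothetical. For TF 1.13 only cuDNN 7.5.1 was reported as working; other versions may work too.

```python
# (CUDA toolkit, relation, cuDNN version) per TF version, as reported in
# this thread. "minor" means: same major.minor as the compiled-in cuDNN.
REPORTED_SETUPS = {
    "1.6.0": ("9.0", "minor", (7, 0, 4)),   # from the error log
    "1.12": ("9.0", ">=", (7, 2, 1)),       # csbdeep update site bindings
    "1.13": ("10.0", "==", (7, 5, 1)),      # only tested combination
}


def parse_version(text):
    """Turn '7.5.1' into (7, 5, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def matches_reported_setup(tf_version, cuda, cudnn):
    """Check a setup against the combinations reported in this thread."""
    reported_cuda, relation, reported_cudnn = REPORTED_SETUPS[tf_version]
    if cuda != reported_cuda:
        return False
    installed = parse_version(cudnn)
    if relation == ">=":
        return installed >= reported_cudnn
    if relation == "==":
        return installed == reported_cudnn
    # "minor": major and minor must agree, patch level is ignored
    return installed[:2] == reported_cudnn[:2]


if __name__ == "__main__":
    print(matches_reported_setup("1.6.0", "9.0", "7.1.4"))
    print(matches_reported_setup("1.12", "9.0", "7.2.1"))
    print(matches_reported_setup("1.13", "10.0", "7.5.1"))
```

Comparing tuples instead of strings avoids ordering mistakes such as "7.10" sorting before "7.2".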
