TensorFlow doesn't recognize GPU on Ubuntu 18.04 with CUDA 9.1, cuDNN 7.1, Python 3.6, conda

Question

I run Python 3.6 in a conda environment on an Ubuntu 18.04 machine, and TensorFlow does not recognize my GPU.

Output of lsb_release -a:

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic

Information from nvidia-smi:

NVIDIA-SMI 390.77 Driver Version: 390.77 Tesla P40

Information from nvcc -V:

Cuda compilation tools, release 9.1, V9.1.85

Output from nvidia-debugdump -l:

Found 1 NVIDIA devices
Device ID: 0
Device name: Tesla P40
GPU internal ID: 0322417022310

Output from lspci -nnk | grep -i nvidia:

a5dd:00:00.0 3D controller [0302]: NVIDIA Corporation GP102GL [Tesla P40] [10de:1b38] (rev a1)
Subsystem: NVIDIA Corporation GP102GL [Tesla P40] [10de:11d9]
Kernel driver in use: nvidia
Kernel modules: nvidia_drm, nvidia

Output from conda --version:

conda 4.5.12

Output of echo $PATH (without the spaces):

/home/***/anaconda3/envs/tf_gpu/bin: /usr/local/cuda/bin: /home/***/.local/bin: /home/***/anaconda3/bin: /usr/local/sbin: /usr/local/bin: /usr/sbin: /usr/bin: /sbin: /bin: /usr/games: /usr/local/games: /snap/bin

Output of echo $LD_LIBRARY_PATH (without the spaces):

/usr/local/cuda/lib64: /usr/local/cuda/extras/CUPTI/lib64: /usr/local/cuda/lib64: /usr/local/cuda/extras/CUPTI/lib64:
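
The value above lists each directory twice. When a GPU build fails to load, it can help to see which CUDA runtime libraries are actually reachable through LD_LIBRARY_PATH. The following is a minimal sketch, assuming the runtime library is named libcudart.so* as in a standard CUDA install; the helper names split_search_path and find_cuda_runtimes are made up for illustration. CUDA libraries installed by conda typically sit inside the environment's own lib directory, so they may not show up in this check.

```python
import glob
import os


def split_search_path(value):
    """Split a colon-separated search path, dropping empty and duplicate entries."""
    seen, entries = set(), []
    for entry in value.split(":"):
        entry = entry.strip()
        if entry and entry not in seen:
            seen.add(entry)
            entries.append(entry)
    return entries


def find_cuda_runtimes(value):
    """Return every libcudart.so* file found in the directories of the search path."""
    hits = []
    for directory in split_search_path(value):
        # Assumption: the CUDA runtime follows the usual libcudart.so* naming.
        hits.extend(sorted(glob.glob(os.path.join(directory, "libcudart.so*"))))
    return hits


# Inspect the search path of the current shell (empty if the variable is unset).
for path in find_cuda_runtimes(os.environ.get("LD_LIBRARY_PATH", "")):
    print(path)
```

The version suffix of the files it prints (for example .9.1 versus .9.2) shows which CUDA runtime the system path provides.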

I create and activate my environment like this:

conda create -n tf_gpu python=3.6 tensorflow-gpu
conda activate tf_gpu

This installs the following packages:

_tflow_select: 2.1.0-gpu
absl-py: 0.6.1-py36_0
astor: 0.7.1-py36_0
blas: 1.0-mkl
c-ares: 1.15.0-h7b6447c_1
ca-certificates: 2018.03.07-0
certifi: 2018.11.29-py36_0
cudatoolkit: 9.2-0
cudnn: 7.2.1-cuda9.2_0
cupti: 9.2.148-0
gast: 0.2.0-py36_0
grpcio: 1.16.1-py36hf8bcb03_1
h5py: 2.8.0-py36h989c5e5_3
hdf5: 1.10.2-hba1933b_1
intel-openmp: 2019.1-144
keras-applications: 1.0.6-py36_0
keras-preprocessing: 1.0.5-py36_0
libedit: 3.1.20170329-h6b74fdf_2
libffi: 3.2.1-hd88cf55_4
libgcc-ng: 8.2.0-hdf63c60_1
libgfortran-ng: 7.3.0-hdf63c60_0
libprotobuf: 3.6.1-hd408876_0
libstdcxx-ng: 8.2.0-hdf63c60_1
markdown: 3.0.1-py36_0
mkl: 2019.1-144
mkl_fft: 1.0.6-py36hd81dba3_0
mkl_random: 1.0.2-py36hd81dba3_0
ncurses: 6.1-he6710b0_1
numpy: 1.15.4-py36h7e9f1db_0
numpy-base: 1.15.4-py36hde5b4d6_0
openssl: 1.1.1a-h7b6447c_0
pip: 18.1-py36_0
protobuf: 3.6.1-py36he6710b0_0
python: 3.6.7-h0371630_0
readline: 7.0-h7b6447c_5
scipy: 1.1.0-py36h7c811a0_2
setuptools: 40.6.3-py36_0
six: 1.12.0-py36_0
sqlite: 3.26.0-h7b6447c_0
tensorboard: 1.12.0-py36hf484d3e_0
tensorflow: 1.12.0-gpu_py36he74679b_0
tensorflow-base: 1.12.0-gpu_py36had579c0_0
tensorflow-gpu: 1.12.0-h0d30ee6_0
termcolor: 1.1.0-py36_1
tk: 8.6.8-hbc83047_0
werkzeug: 0.14.1-py36_0
wheel: 0.32.3-py36_0
xz: 5.2.4-h14c3975_4
zlib: 1.2.11-h7b6447c_3
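
One detail in this list may be worth a sanity check: conda pulled in cudatoolkit 9.2, while nvidia-smi reports driver 390.77 and the system-wide toolkit is 9.1. Each CUDA release requires a minimum driver version. The sketch below compares the two, assuming the Linux minimums 384.81 (CUDA 9.0), 390.46 (CUDA 9.1), and 396.26 (CUDA 9.2) as I recall them from NVIDIA's release notes; treat the table as an assumption and verify it there. The names MIN_DRIVER, parse_version, and driver_supports are hypothetical helpers, not part of any library.

```python
# Minimum Linux driver per CUDA toolkit release.
# Assumed values: verify against NVIDIA's CUDA release notes before relying on them.
MIN_DRIVER = {
    "9.0": (384, 81),
    "9.1": (390, 46),
    "9.2": (396, 26),
}


def parse_version(text):
    """Turn a dotted version string such as '390.77' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports(driver_version, cuda_version):
    """Return True if the driver meets the assumed minimum for that CUDA release."""
    return parse_version(driver_version) >= MIN_DRIVER[cuda_version]


driver = "390.77"  # driver version reported by nvidia-smi in this question
for cuda in sorted(MIN_DRIVER):
    status = "ok" if driver_supports(driver, cuda) else "driver too old"
    print(f"CUDA {cuda}: {status}")
```

If the assumed table is right, driver 390.77 passes for CUDA 9.0 and 9.1 but not for 9.2, which would mean the conda-provided toolkit cannot initialize the GPU with this driver.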

Then I check the devices available to TensorFlow in a Python console:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

Which prints this:

2018-12-18 10:44:12.135984: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 14921140553341499580
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 17804082860482987174
physical_device_desc: "device: XLA_CPU device"
]
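
Reading the raw device dump is error-prone, so the check can be reduced to a filter on device_type. This is a minimal sketch: gpu_device_names and list_tensorflow_gpus are hypothetical helper names, and the sample objects only imitate the name and device_type fields shown in the output above, so the filter runs even without TensorFlow installed.

```python
from types import SimpleNamespace


def gpu_device_names(devices):
    """Return the names of all devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_tensorflow_gpus():
    """Query TensorFlow for local devices and keep only the GPUs."""
    # Imported lazily so the filter above can be used without TensorFlow.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())


# Stand-ins for the two devices reported in this question.
sample = [
    SimpleNamespace(name="/device:CPU:0", device_type="CPU"),
    SimpleNamespace(name="/device:XLA_CPU:0", device_type="XLA_CPU"),
]
print(gpu_device_names(sample))  # → []
```

On a working setup, list_tensorflow_gpus() would be expected to return at least one entry such as "/device:GPU:0".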

As you can see, no GPU is listed. Oddly, when I install PyTorch in the same setup, it recognizes the GPU without any trouble. I have tried various things suggested in other posts, such as removing the protobuf and tensorflow-gpu packages from conda and reinstalling them with pip, but that didn't change anything.

What can I do to get TensorFlow to recognize the GPU? Any help is highly appreciated!

Similar question that didn't help resolve my issue:

tensorflow on GPU: no known devices, despite cuda's deviceQuery returning a "PASS" result

For the cuDNN installation I followed this guide (up to the Bazel instructions):

https://medium.com/@asmello/how-to-install-tensorflow-cuda-9-1-into-ubuntu-18-04-b645e769f01d

Comments

Did you install CUDA and cuDNN? Is your GPU supported by CUDA? Are your paths set correctly, something like this?
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
– nick nick, Commented Dec 18, 2018 at 15:32

@nicknick I added the information you requested to the question for clarification. Thanks for pointing this out.
– cowhi, Commented Dec 19, 2018 at 12:50

Thanks. Sorry, I can't help with that problem; it works fine locally. :(
– nick nick, Commented Dec 19, 2018 at 14:26

I could not resolve the issue. In the end, I downgraded CUDA to 9.0 and installed everything around that...
– cowhi, Commented Jan 29, 2019 at 1:19

Make sure CUDA and cuDNN are on the same path where TensorFlow is installed. Thanks!
– user11530462, Commented Feb 8, 2022 at 2:10
