51

I have already spent a considerable amount of time digging around on Stack Overflow and elsewhere looking for the answer, but couldn't find anything.

Hi all,

I am running TensorFlow with Keras on top. I am 90% sure I installed TensorFlow GPU; is there any way to check which install I did?

I was trying to run some CNN models from a Jupyter notebook and I noticed that Keras was running the model on the CPU (checked Task Manager, CPU was at 100%).

I tried running this code from the tensorflow website:

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

And this is what I got:

MatMul: (MatMul): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.783183: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] MatMul: (MatMul)/job:localhost/replica:0/task:0/cpu:0
b: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.784779: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] b: (Const)/job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.786128: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] a: (Const)/job:localhost/replica:0/task:0/cpu:0
[[ 22.  28.]
 [ 49.  64.]]

Which to me shows I am running on my CPU, for some reason.

I have a GTX 1050 (driver version 382.53). I installed CUDA and cuDNN, and TensorFlow installed without any problems. I installed Visual Studio 2015 as well, since it was listed as a compatible version.

I remember CUDA mentioning something about an incompatible driver being installed, but if I recall correctly CUDA should have installed its own driver.

Edit: I ran these commands to list the available devices:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

and this is what I get

[name: "/cpu:0" device_type: "CPU" memory_limit: 268435456 locality { } incarnation: 14922788031522107450 ] 

and a whole lot of warnings like this

2017-06-29 17:32:45.401429: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations. 

Edit 2

Tried running

pip3 install --upgrade tensorflow-gpu 

and I get

Requirement already up-to-date: tensorflow-gpu in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages
Requirement already up-to-date: markdown==2.2.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: html5lib==0.9999999 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: werkzeug>=0.11.10 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: wheel>=0.26 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: bleach==1.5.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: six>=1.10.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: protobuf>=3.2.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: backports.weakref==1.0rc1 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: numpy>=1.11.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: setuptools in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from protobuf>=3.2.0->tensorflow-gpu)
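For anyone wanting to double-check from inside Python which TensorFlow packages are actually installed, here is a minimal sketch using setuptools' pkg_resources (setuptools is present per the pip output above). A plain tensorflow package sitting next to tensorflow-gpu is a common cause of CPU-only runs:

import pkg_resources

# List every installed distribution whose name mentions "tensorflow";
# seeing both "tensorflow" and "tensorflow-gpu" here usually explains a CPU-only run.
for dist in pkg_resources.working_set:
    if "tensorflow" in dist.project_name.lower():
        print(dist.project_name, dist.version)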

Solved: Check comments for solution. Thanks to all who helped!

I am new to this, so any help is greatly appreciated! Thank you.

18
  • 2
    Can you check that you do not have several TensorFlow versions installed by running pip list and checking all lines with tensorflow? Commented Jun 29, 2017 at 15:50
  • 4
    You should uninstall tensorflow and keep tensorflow-gpu: pip uninstall tensorflow Commented Jun 29, 2017 at 16:08
  • 4
    Okay, I think I fixed it. I think when I uninstalled tensorflow it deleted the __init__.py file or something. So I ran pip install --ignore-installed --upgrade and now from tensorflow.python.client import device_lib; print(device_lib.list_local_devices()) shows the GPU as one of the devices. Commented Jun 29, 2017 at 17:41
  • 2
    I tried the above steps; it doesn't show the GPU as a device. tensorflow-gpu and tensorflow-tensorboard are shown in the list of installed packages. Any help? Commented Jan 20, 2018 at 23:10
  • 6
    For versions > 1.15, GPU support is included in the tensorflow package: tensorflow.org/install/gpu Commented Mar 3, 2020 at 12:53

8 Answers

45

To check which devices are available to TensorFlow, you can run the following and see whether any GPU devices are listed:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
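If you are on a newer TensorFlow (2.x), the following quick checks (a minimal sketch, assuming these 2.x APIs are available in your installed version) tell you both whether the wheel was built with CUDA support and which GPUs it can actually see:

import tensorflow as tf

# Was this wheel built with CUDA support at all (i.e. is it a GPU build)?
print(tf.test.is_built_with_cuda())

# Which GPU devices can TensorFlow actually see right now?
print(tf.config.list_physical_devices('GPU'))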

More info

There are also C++ logs available controlled by the TF_CPP_MIN_VLOG_LEVEL env variable, e.g.:

import os
os.environ["TF_CPP_MIN_VLOG_LEVEL"] = "2"

should allow them to be printed when running import tensorflow as tf.

You should see logs like these if you use GPU-enabled TensorFlow with proper access to the GPU machine:

successfully opened CUDA library libcublas.so.*.* locally
successfully opened CUDA library libcudnn.so.*.* locally
successfully opened CUDA library libcufft.so.*.* locally

On the other hand, if there are no CUDA libraries in the system / container, you will see:

Could not find cuda drivers on your machine, GPU will not be used. 

and where CUDA libraries are installed but there is no GPU physically available, TF will import cleanly and error out only later, when you run device_lib.list_local_devices(), with this:

failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected 

2 Comments

They are C++ logs and are controlled by the TF_CPP_MIN_VLOG_LEVEL env variable, e.g.: export TF_CPP_MIN_VLOG_LEVEL=2 should allow them to be printed when running import tensorflow as tf.
In my case, TF_CPP_MAX_VLOG_LEVEL=2 works instead of TF_CPP_MIN_VLOG_LEVEL=2.
22

It may sound dumb, but try a reboot. It helped me and some other folks on GitHub.

3 Comments

Same here. WTF. Been struggling for two days, a single reboot helped :|
I love you. In my case it was probably due to not rebooting after changing the driver from X to NVidia.
I was using TensorFlow GPU for months and suddenly it stopped using the GPU. Reboot solved it. Thanks.
10

I was still having trouble getting GPU support even after correctly installing tensorflow-gpu via pip. My problem was that I had installed TensorFlow 1.5 and CUDA 9.1 (the default version Nvidia directs you to), whereas the precompiled TensorFlow 1.5 works with CUDA versions <= 9.0. Here is the download page on Nvidia's site to get the correct CUDA 9.0:

https://developer.nvidia.com/cuda-90-download-archive

Also make sure to update your cuDNN to a version compatible with CUDA 9.0 https://developer.nvidia.com/cudnn https://developer.nvidia.com/rdp/cudnn-download

Comments

2

If you happen to be using Anaconda to manage your environments, uninstall all existing versions of TensorFlow first:

pip uninstall tensorflow
pip3 uninstall tensorflow

Install tensorflow-gpu using conda

conda install tensorflow-gpu 

If you don't mind starting from a new environment, though, the easiest way is:

conda create --name tf_gpu tensorflow-gpu 

This creates a new conda environment named tf_gpu with tensorflow-gpu installed.

1 Comment

I believe the GPU-only version is now tensorflow and tensorflow-gpu is outdated. "For the CPU-only build use the pip package named tensorflow-cpu."
2

You may also have a CUDA version mismatch that needs to be solved one way or the other (downgrading / pinning TensorFlow to the latest version supported by your system CUDA is arguably quicker, but only doing the opposite is future-proof).

To verify, check the CUDA version used by your installed TensorFlow package:

>>> import tensorflow as tf
>>> tf.sysconfig.get_build_info()['cuda_version']
'11.8'

... and compare it with the CUDA version installed on the host / in the container / VM:

>>> import os
>>> os.system("nvcc --version")
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
0
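To compare both sides in one go, here is a minimal sketch (assuming a TF 2.x GPU build, where tf.sysconfig.get_build_info() also reports the cuDNN version, and that nvcc is on PATH):

import subprocess
import tensorflow as tf

# CUDA/cuDNN versions the installed wheel was compiled against
build = tf.sysconfig.get_build_info()
print("Built against CUDA :", build.get('cuda_version'))
print("Built against cuDNN:", build.get('cudnn_version'))

# CUDA toolkit version actually present on the host / in the container
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)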

More info

When tensorflow imports cleanly (without any warnings) but detects only the CPU on a GPU-equipped machine with CUDA libraries installed, then you may also have a CUDA version mismatch between the pre-compiled tensorflow package wheel and the system / container-installed versions.

The above CUDA version mismatch (v11.8 used during TensorFlow compilation vs. v11.2 CUDA compiler installed in the container) resulted in TF without GPU access, despite nvidia-smi loading correctly.

See also the Tensorflow CUDA compatibility table (tested build configurations).

Comments

2

If you have problems running TensorFlow on the GPU, you should check whether you have compatible versions of CUDA and cuDNN installed at all.

These versions should ideally be exactly the same as those tested to work by the devs here. For example, for tensorflow==2.8.0 you should have CUDA v11.2 and cuDNN v8.1.

Also, you should add the CUDA /bin and /libnvvp folders to the system PATH.
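A quick way to confirm that (a minimal sketch; the exact CUDA install locations on your machine may differ) is to print every PATH entry that mentions CUDA:

import os

# Print every PATH entry that looks like a CUDA folder; you would expect the
# toolkit's bin and libnvvp directories to show up here.
for entry in os.environ.get("PATH", "").split(os.pathsep):
    if "cuda" in entry.lower():
        print(entry)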

This answer is based on the Tensorflow 2021 install tutorial.

1 Comment

VERY good point indeed - TF is the complete opposite of PyTorch (which comes with its own cuDNN library bundled in): TF will not complain at all (silently falling back to CPU) if you don't have cuDNN installed at all (e.g. using the 11.8.0-devel-ubuntu22.04 container image instead of 11.8.0-cudnn8-devel-ubuntu22.04). What saved you 1.5 GB per image before will later cost you a day or so in debug time.
1

For me the following worked.

I used a conda environment, as a plain Python environment meant setting LD_LIBRARY_PATH and installing CUDA manually, which is another mess.

In the mentioned blog, the author installed cudatoolkit and cuDNN inside conda and then installed tensorflow-gpu afterwards, which fixed the problem.

P.S. As far as I have read, cudatoolkit and cuDNN play a huge role in getting your code running on tensorflow-gpu.

Comments

1

I ran into a similar problem. I had the following versions of the TensorFlow libraries:

tensorboard               2.4.1   pyhd8ed1ab_1    conda-forge
tensorboard-plugin-wit    1.8.0   pyh44b312d_0    conda-forge
tensorflow                2.4.1   py39hf3d152e_0  conda-forge
tensorflow-base           2.4.1   py39h23a8cbf_0  conda-forge
tensorflow-estimator      2.4.0   pyh9656e83_0    conda-forge
tensorflow-gpu            2.4.1   h30adc30_0

The same versions of the libraries were installed on another machine that was able to utilise the GPU. The CUDA toolkit version and driver versions were the same on both machines (the one where it was working and the one where it wasn't).

Turns out the reason was that tensorflow-gpu=2.4.1 is compatible with Python version 3.8.10. Changing my Python version to 3.8.10 and keeping all other things unchanged worked for me!
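If you want to rule this out quickly, here is a minimal sketch that prints the interpreter and TensorFlow versions you are actually running, so you can compare them against the tested build configurations:

import sys
import tensorflow as tf

# Compare these against the versions the compatibility tables expect.
print("Python    :", sys.version.split()[0])
print("TensorFlow:", tf.__version__)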

Comments
