I've tried to install CUDA on three different VMs but have been unsuccessful in getting it to recognize my GPU.

I am using an Azure VM (Standard NV6) with an M60 GPU.

With a fresh VM I run the following commands taken from this guide:

wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1404-8-0-local-ga2_8.0.61-1_amd64-deb
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
sudo apt-get update
sudo apt-get install -y cuda

It appears to run successfully and doesn't report any problems. But when I run

nvidia-smi 

I receive the following:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running 

I have also tried 16.04 LTS and various other GPU instances. Google tells me others are using these Azure GPU instances with TensorFlow, so it doesn't appear to be an issue with the graphics card.

Finally, I have reviewed what seems to be the canonical guide to installing CUDA on Ubuntu but it fails when running

sudo ./NVIDIA-Linux-x86_64-331.62.run 

The message in the log file:

ERROR: Unable to load the 'nvidia-drm' kernel module. 

My Question

What is the most reliable method for installing CUDA 8 on Ubuntu 14.04 LTS?

Are there any special precautions that I need to take when running CUDA on a VM?

Edit: Additional Info

uname -a returns

Linux 2017-02-21-josh-gpu 4.4.0-64-generic #85~14.04.1-Ubuntu SMP Mon Feb 20 12:10:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 

lsmod returns

Module                  Size  Used by
drm_kms_helper        151552  0
drm                   360448  1 drm_kms_helper
syscopyarea            16384  1 drm_kms_helper
sysfillrect            16384  1 drm_kms_helper
sysimgblt              16384  1 drm_kms_helper
fb_sys_fops            16384  1 drm_kms_helper
udf                    90112  0
crc_itu_t              16384  1 udf
dm_crypt               28672  0
joydev                 20480  0
hid_generic            16384  0
hid_hyperv             16384  0
hid                   118784  2 hid_hyperv,hid_generic
hyperv_keyboard        16384  0
hv_balloon             24576  0
input_leds             16384  0
serio_raw              16384  0
hv_netvsc              40960  0
hv_storvsc             20480  2
hv_utils               28672  2
scsi_transport_fc      65536  1 hv_storvsc
crct10dif_pclmul       16384  0
crc32_pclmul           16384  0
ghash_clmulni_intel    16384  0
hyperv_fb              20480  1
aesni_intel           167936  0
aes_x86_64             20480  1 aesni_intel
lrw                    16384  1 aesni_intel
gf128mul               16384  1 lrw
glue_helper            16384  1 aesni_intel
ablk_helper            16384  1 aesni_intel
cryptd                 20480  3 ghash_clmulni_intel,aesni_intel,ablk_helper
psmouse               126976  0
hv_vmbus               90112  7 hv_balloon,hyperv_keyboard,hv_netvsc,hid_hyperv,hv_utils,hyperv_fb,hv_storvsc
floppy                 73728  0
  • What sort of a VM does the Azure service provide? What do uname -a and lsmod report? Commented Feb 23, 2017 at 10:13
  • I've appended that info near the end of my question. Let me know if you need anything else. Commented Feb 24, 2017 at 6:03

1 Answer

The official Azure documentation points out:

Currently, Linux GPU support is only available on Azure NC VMs running Ubuntu Server 16.04 LTS.

I'm not sure why they even let you create GPU instances with 14.04 installed, but hopefully this will help spread the word.
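
Since the release matters this much, it may be worth checking it before installing anything. The following is only a sketch: `release_from_file` and `require_release` are hypothetical helper names of mine, not part of any Azure or NVIDIA tooling, and the check assumes the VM has the standard `/etc/os-release` file.

```shell
#!/usr/bin/env bash
# Sketch: stop early unless the VM runs the expected Ubuntu release.
# Both function names are hypothetical helpers made up for this example.

# Print VERSION_ID (e.g. 16.04) from an os-release style file.
# The file is sourced in a subshell so its variables do not leak out.
release_from_file() {
    ( . "$1" && printf '%s\n' "$VERSION_ID" )
}

# Succeed only if the release in the file equals the expected one.
# The file argument defaults to /etc/os-release.
require_release() {
    local expected="$1" file="${2:-/etc/os-release}"
    local actual
    actual="$(release_from_file "$file")"
    if [ "$actual" = "$expected" ]; then
        echo "OK: Ubuntu ${actual}"
    else
        echo "Unsupported release: found '${actual}', expected '${expected}'" >&2
        return 1
    fi
}
```

Calling `require_release 16.04 || exit 1` at the top of an install script would abort on a 14.04 image before any driver packages are touched.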

After creating a fresh 16.04 instance I did the following:

First, I had to uninstall/blacklist the Nouveau drivers that come pre-installed on Ubuntu 16.04. They're not compatible with the NVIDIA drivers we're trying to install and will cause errors later on if we don't remove them.

 sudo nano /etc/modprobe.d/blacklist.conf 

At the bottom of the file add the following entries:

blacklist amd76x_edac #this might not be required for x86 32 bit users.
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
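
If you would rather script this step than edit the file by hand, something along these lines should work. It is a minimal sketch: `blacklist_modules` is a hypothetical helper name, and it only appends plain `blacklist <module>` lines that are not already in the file.

```shell
#!/usr/bin/env bash
# Sketch: append "blacklist <module>" lines to a modprobe config file,
# skipping entries that are already present so it is safe to re-run.
# blacklist_modules is a hypothetical helper, not a standard tool.

blacklist_modules() {
    local file="$1"
    shift
    local module
    touch "$file"
    for module in "$@"; do
        # -x matches the whole line, -F treats the pattern as a fixed string
        if ! grep -qxF "blacklist ${module}" "$file"; then
            printf 'blacklist %s\n' "$module" >> "$file"
        fi
    done
}
```

Run as root, the call would look like `blacklist_modules /etc/modprobe.d/blacklist.conf vga16fb nouveau rivafb nvidiafb rivatv`. After the reboot, `lsmod | grep nouveau` should print nothing.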

Reboot the VM with sudo reboot

I downloaded the drivers directly from Microsoft, but you can substitute with your preferred source:

wget -O NVIDIA-Linux-x86_64-384.73-grid.run https://go.microsoft.com/fwlink/?linkid=849941
chmod +x NVIDIA-Linux-x86_64-384.73-grid.run
sudo ./NVIDIA-Linux-x86_64-384.73-grid.run

I just clicked through the default selected options in the runfile.

Verify driver installation by running nvidia-smi
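
For a slightly more explicit check, the sketch below first confirms that the `nvidia` kernel module is loaded and only then queries the GPU. `module_loaded` and `check_driver` are hypothetical helper names. The `--query-gpu` and `--format` options are standard `nvidia-smi` flags, but what they print depends on your VM, so no sample output is shown.

```shell
#!/usr/bin/env bash
# Sketch: confirm the nvidia kernel module is loaded before querying the GPU.
# module_loaded and check_driver are hypothetical helpers for this example.

# Read lsmod-style output on stdin; succeed if the named module is listed.
# NR > 1 skips the "Module Size Used by" header line.
module_loaded() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# Fail with a message if the module is missing, otherwise print
# the GPU name and driver version as CSV.
check_driver() {
    if ! lsmod | module_loaded nvidia; then
        echo "nvidia kernel module is not loaded" >&2
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}
```

If `check_driver` reports that the module is not loaded, the "couldn't communicate with the NVIDIA driver" error from the question is the expected result, and the driver install is the step to revisit.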

Install CUDA Toolkit 8

CUDA_REPO_PKG=cuda-repo-ubuntu1604_8.0.44-1_amd64.deb
wget -O /tmp/${CUDA_REPO_PKG} http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/${CUDA_REPO_PKG}
sudo dpkg -i /tmp/${CUDA_REPO_PKG}
rm -f /tmp/${CUDA_REPO_PKG}
sudo apt-get update
sudo apt-get install cuda-drivers
