Why is CUDALink failing?

Question

I have a new Windows 10 laptop with a discrete NVIDIA graphics card (all drivers up to date). The first example from the documentation of CUDAFunctionLoad[] in v11.2 fails as shown below, even though CUDAQ[] evaluates to True and OpenCLLink appears to be working fine.

Needs["CUDALink`"]

code = "
__global__ void addTwo(mint * in, mint * out, mint length) {
    int index = threadIdx.x + blockIdx.x*blockDim.x;
    if (index < length)
        out[index] = in[index] + 2;
}";

cudaFun = CUDAFunctionLoad[code, "addTwo", {{_Integer, _, "Input"}, {_Integer, _, "Output"}, _Integer}, 256, "ShellOutputFunction" -> Print]

Here's the output: (note the message nvcc fatal: Host compiler targets unsupported OS.)

In[3]:= cudaFun = CUDAFunctionLoad[code, "addTwo", {{_Integer, _, "Input"}, {_Integer, _, "Output"}, _Integer}, 256, "ShellOutputFunction" -> Print, "ShellCommandFunction" :> Print]

During evaluation of In[3]:= CUDAFunctionLoad::cmpf: The kernel compilation failed. Consider setting the option "ShellOutputFunction"->Print to display the compiler error message.

During evaluation of In[3]:= call "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvarsall.bat" amd64
"C:\Users\micha\AppData\Roaming\Mathematica\Paclets\Repository\CUDAResources-Win64-11.2.22\CUDAToolkit\bin\nvcc.exe" -cubin -m64 -arch=sm_61 -DUSING_CUDA_FUNCTION=1 -Dmint="long long" -DReal_t=double -DUSING_DOUBLE_PRECISIONQ=1 -o "C:\Users\micha\AppData\Roaming\Mathematica\ApplicationData\CUDALink\BuildFolder\sbp-14988\Working-sbp-14988-15020-1\CUDAFunction-1595.cubin" "C:\Users\micha\AppData\Roaming\Mathematica\ApplicationData\CUDALink\BuildFolder\sbp-14988\Working-sbp-14988-15020-1\CUDAFunction-1595.cu"

During evaluation of In[3]:=
C:\Users\micha\AppData\Roaming\Mathematica\ApplicationData\CUDALink\BuildFolder\sbp-14988\Working-sbp-14988-15020-1>call "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvarsall.bat" amd64
**********************************************************************
** Visual Studio 2017 Developer Command Prompt v15.5.2
** Copyright (c) 2017 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'
Microsoft (R) C/C++ Optimizing Compiler Version 19.12.25831 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
tmpxft_00000f74_00000000-1.cpp
nvcc fatal : Host compiler targets unsupported OS.

Out[3]= CUDAFunctionLoad[" __global__ void addTwo(mint * in, mint * out, mint length) { int index = threadIdx.x + blockIdx.x*blockDim.x; if (index < length) out[index] = in[index] + 2; }", "addTwo", {{_Integer, _, "Input"}, {_Integer, _, "Output"}, _Integer}, 256, "ShellOutputFunction" -> Print, "ShellCommandFunction" :> Print]

Here's the result of my CUDAResourcesInformation[]:

{{"Name" -> "CUDAResources",
  "Version" -> "11.2.22",
  "BuildNumber" -> "",
  "Qualifier" -> "Win64",
  "WolframVersion" -> "11.2",
  "SystemID" -> {"Windows-x86-64"},
  "Description" -> "{ToolkitVersion -> v8.0, MinimumDriver -> 290}",
  "Category" -> "",
  "Creator" -> "",
  "Publisher" -> "",
  "Support" -> "",
  "Internal" -> False,
  "Location" -> "C:\\Users\\micha\\AppData\\Roaming\\Mathematica\\Paclets\\Repository\\CUDAResources-Win64-11.2.22",
  "Context" -> {},
  "Enabled" -> True,
  "Loading" -> Manual,
  "Hash" -> "9c9b3bf9dfc07e0cc2376f0ef13e01d5"}}

Machine Details: I'm on Windows 10 Pro v1703 with an Nvidia GeForce 2016 card, the CUDA 9.1 Toolkit, and Visual Studio installed.

References:

The basic issue (for the current revision of the question) is that CUDA Toolkit 8 is incompatible with VS 2017. In principle this can be resolved either by using an older VS version or by pointing CUDALink to the 9.1 Toolkit (done through the "XCompilerInstallation" option for the C compiler location or the "CompilerInstallation" option for the toolkit location).
– ilian, Commented Jan 7, 2018 at 5:21
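A minimal sketch of what the comment above describes, passing both options to CUDAFunctionLoad. The two directory paths are assumptions for illustration only (a default-looking CUDA 9.1 install location and an older Visual Studio compiler directory); neither comes from the question, so substitute whatever exists on your machine. Whether a given toolkit and host compiler pair is actually compatible still has to be checked against NVIDIA's release notes.

```wolfram
Needs["CUDALink`"]

(* ASSUMPTION: hypothetical install locations, adjust to your system. *)
toolkitDir = "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.1";
hostCompilerDir = "C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\bin";

(* Same kernel as in the question. *)
code = "
__global__ void addTwo(mint * in, mint * out, mint length) {
    int index = threadIdx.x + blockIdx.x*blockDim.x;
    if (index < length)
        out[index] = in[index] + 2;
}";

cudaFun = CUDAFunctionLoad[code, "addTwo",
  {{_Integer, _, "Input"}, {_Integer, _, "Output"}, _Integer}, 256,
  "CompilerInstallation" -> toolkitDir,       (* location of the CUDA Toolkit *)
  "XCompilerInstallation" -> hostCompilerDir, (* directory where nvcc finds the C compiler *)
  "ShellCommandFunction" :> Print,            (* print the nvcc command that is run *)
  "ShellOutputFunction" -> Print]             (* print the compiler output *)
```

With "ShellCommandFunction" :> Print set, the printed command shows which nvcc.exe and which host compiler were actually picked up, which makes it easy to confirm that the options took effect.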

ilian · Accepted Answer · 2018-03-20 20:21:59Z

This should be working better with Visual Studio 2017 and recent Nvidia cards after updating to the latest CUDAResources for version 11.3:

PacletSiteUpdate /@ PacletSites[];
Needs["CUDALink`"]
CUDAResourcesInstall[Update -> True]
(* {Paclet[CUDAResources, 11.3.38, <>]} *)

halirutan · Accepted Answer · 2018-01-07 01:26:24Z

The situation is that you (and I) have a card that is too new. You can try the following: set the option "ShellCommandFunction" :> Print and run again. The output should show the location of nvcc, the CUDA compiler. You can call this compiler on a command line with --help. Since a terminal is usually considered the evil gate to hell on a Windoze machine, here is the output:

--gpu-architecture (-arch)
Specify the name of the class of NVIDIA 'virtual' GPU architecture for which the CUDA input files must be compiled. [...] Allowed values for this option: 'compute_20', 'compute_30', 'compute_32', 'compute_35', 'compute_37', 'compute_50', 'compute_52', 'compute_53', 'sm_20', 'sm_21', 'sm_30', 'sm_32', 'sm_35', 'sm_37', 'sm_50', 'sm_52', 'sm_53'.

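The CUDA Toolkit version on my machine is 7.5, while version 9.1 is currently available. The list above stops at sm_53, so an architecture such as the sm_61 that appears in your nvcc command is not supported by this toolkit. I don't bother that much with CUDA from within Mathematica these days, but this is the reason.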

Fix

I tested it on Linux with Mathematica Version 11.2. First, you need to run

CUDAResourcesInstall[Update -> True]

Then, there should be a new paclet in your paclet directory. For me, this is in my home directory under

~/.Mathematica/Paclets/Repository/CUDAResources-Lin64-11.2.63

Restart Mathematica; after that, you should be able to build the CUDA kernel successfully.
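To check that the update took effect, something along these lines can be run in a fresh kernel. This is only a sketch: it assumes CUDAResourcesInformation[] returns a list of rule lists in the form shown in the question and simply reads the first entry, and it reuses the addTwo kernel from the question.

```wolfram
Needs["CUDALink`"]

(* Show the version and toolkit description of the first CUDAResources paclet found. *)
{"Version", "Description"} /. First[CUDAResourcesInformation[]]

(* Same kernel as in the question. *)
code = "
__global__ void addTwo(mint * in, mint * out, mint length) {
    int index = threadIdx.x + blockIdx.x*blockDim.x;
    if (index < length)
        out[index] = in[index] + 2;
}";

cudaFun = CUDAFunctionLoad[code, "addTwo",
  {{_Integer, _, "Input"}, {_Integer, _, "Output"}, _Integer}, 256,
  "ShellOutputFunction" -> Print];

(* If the kernel compiled, the output should hold each input element increased by 2. *)
input = Range[10];
cudaFun[input, ConstantArray[0, Length[input]], Length[input]]
```

If CUDAFunctionLoad still returns unevaluated, the printed compiler output should indicate whether the old toolkit is still being used.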

I'll try that command and post the output. Did you run it in "Bash for Ubuntu on Windows" or in Bash.exe? But does this mean there's no fix? And whose responsibility would a fix be? I'm guessing it's on Mathematica's side (i.e. CUDALink needs to be updated).
– M.R., Commented Jan 7, 2018 at 0:56
I'm on Ubuntu, where I have a real terminal. I'm updating at the moment and will report back in a few minutes.
– halirutan, Commented Jan 7, 2018 at 0:57
Looks like it's up to WRI to supply CUDAResources that can work with modern GPU architectures. These are all old: wolfram.com/CUDA/CUDAResources.html
– M.R., Commented Jan 7, 2018 at 1:17
What about on Windows? Do you have to give the location of the new paclet or delete the old one?
– M.R., Commented Jan 7, 2018 at 1:33
