CUDA "invalid argument" error when starting kernel

Question

I get an error when launching a kernel. I launch it with a grid size of (3000000, 16), and CUDA reports an "invalid argument" runtime error at the launch. I tried different maxPixelCount values and found that with 200000 the error is reported, while with 50000 the launch succeeds.

dim3 dimGrid(maxPixelCount, imageCount);
printf("grid: %u * %u * %u", dimGrid.x, dimGrid.y, dimGrid.z);
mcudaGetGrayDataKernel <<< dimGrid, 1 >>> (deviceDestDataPtrs, deviceImageDataPtrs, deviceSizes);
cudaStatus = cudaGetLastError();
if (cudaStatus != cudaSuccess) {
    printf("cuda start kernel error\n%s", cudaGetErrorString(cudaStatus));
    goto Error;
}

I checked the maximum grid size my card supports, using the following statement:

printf(" - max grid size: %d * %d * %d\n", prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);

I got the following message:

- max grid size: 2147483647 * 65535 * 65535

I think this means my grid dimensions are within the supported range. So why does the error appear?

My IDE is Visual Studio 2013.

This problem has been solved. To reach the max limit of grid size, the Device->Code Generation option has to be set to the proper version. For my GPU I modified it to compute_30,sm_30.

How are you compiling your code? If you compile for cc2.0 (default for CUDA 6.5, 7, 7.5) then you will have a lower limit of 65535 imposed. To get the higher limit of 2147483647 you need to compile for a cc3.0 or higher device. A switch like -arch=sm_30 on the compile command line might be all you need.
– Robert Crovella, Commented Dec 17, 2015 at 17:36
I'm using Visual Studio 2013. I modified the project's CUDA C/C++ properties: under Device, I set the Code Generation option to compute_20,sm_30. Under Host, I set Additional Compiler Options to -arch=sm_30. But the problem remains, and a compiler warning was reported: 1>cl : command line warning D9002: ignored unknown option "-arch=sm_30"
– Cosmo, Commented Dec 18, 2015 at 6:15
compute_20,sm_30 won't work. You should choose compute_30,sm_30. And you seem to have changed more than just the code generation option (wherever you added -arch=sm_30, remove that). Since you're struggling with this, you could also just take your code and drop it into the vectorAdd CUDA sample project, and compile it there. And of course you will need a cc3.0 or higher GPU to run it on.
– Robert Crovella, Commented Dec 18, 2015 at 6:34
That solves the problem, thank you! BTW, what do compute and sm mean, respectively?
– Cosmo, Commented Dec 18, 2015 at 7:29
"what do compute and sm mean, respectively?" This answer and this answer should help with that.
– Robert Crovella, Commented Dec 18, 2015 at 15:32

Community · Accepted Answer · 2017-05-23 12:23:14Z

This formulation:

dim3 dimGrid(maxPixelCount, imageCount);

places maxPixelCount in the .x dimension of the variable (dimGrid) that will be used to specify the grid dimensions of the kernel launch:

mcudaGetGrayDataKernel <<< dimGrid, 1 >>> ...

By referring to the programming guide (or you can use the deviceQuery sample code, or query the data yourself programmatically), we can see that devices of compute capability 2.0 support at most 65535 blocks in the .x dimension of the grid. In order to use the larger limit (2^31 - 1) available on a compute capability 3.0 (or higher) device, it's necessary to:

compile for a compute capability 3.0+ device, and
run the code on a compute capability 3.0+ device.
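
The run-time half of that requirement can be checked programmatically. The following is a minimal sketch, not the asker's code: emptyKernel and gridFitsLimits are hypothetical names introduced for illustration, and the use of device 0 is an assumption. It queries the device, compares the grid from the question against the reported limits, and then checks the launch itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical empty kernel, used only to exercise the launch configuration.
__global__ void emptyKernel() {}

// Returns true when every grid dimension is within the limits the device reports.
bool gridFitsLimits(dim3 grid, const int maxGridSize[3]) {
    return grid.x <= (unsigned)maxGridSize[0] &&
           grid.y <= (unsigned)maxGridSize[1] &&
           grid.z <= (unsigned)maxGridSize[2];
}

int main() {
    cudaDeviceProp prop;
    cudaError_t status = cudaGetDeviceProperties(&prop, 0);  // device 0 is an assumption
    if (status != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(status));
        return 1;
    }
    printf("compute capability: %d.%d\n", prop.major, prop.minor);

    dim3 grid(3000000, 16);  // grid size from the question
    if (!gridFitsLimits(grid, prop.maxGridSize)) {
        printf("grid %u * %u * %u exceeds the device limits\n", grid.x, grid.y, grid.z);
        return 1;
    }

    // The device limits say nothing about how the binary was compiled, so the
    // launch can still fail here if the code was built for a cc2.0 target.
    emptyKernel<<<grid, 1>>>();
    status = cudaGetLastError();
    if (status != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(status));
        return 1;
    }

    status = cudaDeviceSynchronize();
    if (status != cudaSuccess) {
        printf("kernel execution failed: %s\n", cudaGetErrorString(status));
        return 1;
    }
    printf("launch succeeded\n");
    return 0;
}
```

Passing the device check is necessary but not sufficient: as in the question, a card that reports 2147483647 for the .x dimension can still reject the launch when the code was compiled for cc2.0.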

There are various methods to specify how to compile for a compute capability 3.0 device. Most of the CUDA sample code projects demonstrate this for Windows and Linux (Makefile) projects. For more information on how to compile for a given device architecture, and what the various switches mean, refer to this answer and this answer and the relevant section of the nvcc manual.
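
One way to confirm which target a kernel was actually built for is to ask the runtime. The sketch below assumes a hypothetical file name (probe.cu), a hypothetical kernel (probeKernel) and a hypothetical helper (builtForAtLeast30); the nvcc line in the comment uses the -gencode form of the compute_30,sm_30 setting discussed above and applies to toolkits that still support sm_30. Treating a value below 30 as the cause of the 65535 cap is an assumption based on the explanation in this answer.

```cuda
// Possible build line (file name is hypothetical):
//   nvcc -gencode arch=compute_30,code=sm_30 probe.cu -o probe
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel whose build target is being inspected.
__global__ void probeKernel() {}

// Versions are encoded as major * 10 + minor, so 30 means compute capability 3.0.
bool builtForAtLeast30(int ptxVersion, int binaryVersion) {
    return ptxVersion >= 30 && binaryVersion >= 30;
}

int main() {
    cudaFuncAttributes attr;
    cudaError_t status = cudaFuncGetAttributes(&attr, probeKernel);
    if (status != cudaSuccess) {
        printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(status));
        return 1;
    }

    printf("PTX version: %d, binary version: %d\n", attr.ptxVersion, attr.binaryVersion);

    // Assumption: a target below 3.0 is what imposes the 65535 limit on grid.x.
    if (!builtForAtLeast30(attr.ptxVersion, attr.binaryVersion)) {
        printf("kernel targets a pre-3.0 architecture; grid.x may be capped at 65535\n");
    }
    return 0;
}
```

In Visual Studio, the equivalent of the build line is the Code Generation value compute_30,sm_30 in the project's CUDA C/C++ Device properties, as described in the comments above.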
