Error compiling a cuda project

Question

I'm having trouble compiling a CUDA project that uses CUDA C and the lodepng library.

My makefile looks like this.

    gpu: super-resolution.cu
        gcc -g -O -c lodepng.c
        nvcc -c super-resolution.cu
        nvcc -o super-resolution-cuda super-resolution.o
        rm -rf super-resolution.o
        rm -rf lodepng.o

Could anyone tell me what I am doing wrong? The build complains about:

    nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
    super-resolution.o: In function `main':
    parallel-algorithm/super-resolution.cu:238: undefined reference to `lodepng_decode32_file(unsigned char**, unsigned int*, unsigned int*, char const*)'
    parallel-algorithm/super-resolution.cu:259: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
    parallel-algorithm/super-resolution.cu:269: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
    parallel-algorithm/super-resolution.cu:282: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
    parallel-algorithm/super-resolution.cu:292: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
    parallel-algorithm/super-resolution.cu:301: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
    ...

I just need a way to compile my .cu file with nvcc and link a C object file (.o) into the result.

EDIT: I tried the suggestion, without success.

    gcc -g -O -c lodepng.c
    nvcc -c super-resolution.cu
    nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
    super-resolution.cu:1:2: warning: #import is a deprecated GCC extension [-Wdeprecated]
     #import "cuda.h"
      ^
    super-resolution.cu(106): warning: expression has no effect
    super-resolution.cu(116): warning: expression has no effect
    super-resolution.cu(141): warning: variable "y" was declared but never referenced
    super-resolution.cu:1:2: warning: #import is a deprecated GCC extension [-Wdeprecated]
     #import "cuda.h"
      ^
    super-resolution.cu(106): warning: expression has no effect
    super-resolution.cu(116): warning: expression has no effect
    super-resolution.cu(141): warning: variable "y" was declared but never referenced
    ptxas /tmp/tmpxft_00000851_00000000-5_super-resolution.ptx, line 197; warning : Double is not supported. Demoting to float
    nvcc -o super-resolution-cuda super-resolution.o lodepng.o
    nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
    super-resolution.o: In function `main':
    tmpxft_00000851_00000000-3_super-resolution.cudafe1.cpp:(.text+0x5d): undefined reference to `lodepng_decode32_file(unsigned char**, unsigned int*, unsigned int*, char const*)'

It still can't resolve the reference to lodepng_decode32_file, even with lodepng.o on the link line.

EDIT: here's our .cu file:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <cstdio>
    extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
    extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );

You're linking with nvcc but not including the object you built with gcc (lodepng.o). Try nvcc -o super-resolution-cuda super-resolution.o lodepng.o in place of your existing nvcc -o super-resolution-cuda super-resolution.o link step.
– Robert Crovella, Commented Apr 24, 2014 at 19:32
So the lodepng_encode32_file reference got sorted out but the lodepng_decode32_file reference did not? I would probably need to see the exact code to understand why, and to check whether you are doing the C/C++ linking correctly (e.g. extern "C", etc.). Are you sure that both lodepng_encode32_file and lodepng_decode32_file are exported and used the same way?
– Robert Crovella, Commented Apr 24, 2014 at 20:14
Seriously? You're posting these updates in the comments? How about the corresponding functions in lodepng? Do their prototypes match? I'm not sure this question can be answered with little snippets. Create a complete, short, compilable case in the question that demonstrates the issue. Yes, you will need to edit your files down substantially, and it will require effort on your part. But it should not be difficult to edit 2 files down to the critical pieces that demonstrate the issue.
– Robert Crovella, Commented Apr 24, 2014 at 20:28

Robert Crovella · Accepted Answer · 2014-04-24 21:23:02Z

Don't use #import. If you want to include cuda.h (which should be unnecessary), use #include instead. I would simply delete that line from your super-resolution.cu file.
What you did not show before, but is now evident, is that in your super-resolution.cu you are including lodepng.h and also later specifying C-linkage for 2 functions: lodepng_decode32_file and lodepng_encode32_file. When I tried compiling your super-resolution.cu the compiler gave me errors like this (I don't know why you don't see them):
```
super-resolution.cu(8): error: linkage specification is incompatible with previous "lodepng_encode32_file"
lodepng.h(184): here

super-resolution.cu(9): error: linkage specification is incompatible with previous "lodepng_decode32_file"
lodepng.h(134): here
```
So basically you are tripping over C and C++ linkage.
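
To make the linkage point concrete, here is a minimal single-file sketch of how the two kinds of linkage differ. The names area, c_area, rgba_bytes and linkage_demo_self_check are hypothetical and have nothing to do with lodepng; the sketch only assumes a standard C++ compiler.

```cpp
#include <cassert>

// C++ linkage (the default in .cpp and .cu files): the compiler encodes the
// parameter types into the symbol name. That is why these two overloads can
// coexist, and why the linker error above prints a full signature.
unsigned area(unsigned width, unsigned height) { return width * height; }
unsigned area(unsigned side) { return side * side; }

// C linkage: the symbol is the plain name "c_area", which is the form a C
// compiler such as gcc emits. Plain names cannot be overloaded.
extern "C" unsigned c_area(unsigned width, unsigned height) {
    return width * height;
}

// A caller has to see a declaration with the same linkage the definition was
// compiled with; otherwise the linker searches for a symbol that was never
// emitted and reports an undefined reference.
unsigned rgba_bytes(unsigned width, unsigned height) {
    return 4 * c_area(width, height);
}

// Small usage check (hypothetical helper, for illustration only).
void linkage_demo_self_check() {
    assert(area(3, 4) == 12);      // two-argument overload
    assert(area(5) == 25);         // one-argument overload
    assert(rgba_bytes(2, 3) == 24);
}
```

In the situation described here, lodepng.h declares the functions with one linkage and the extra extern "C" lines in super-resolution.cu declare them with another, so the two declarations disagree about which symbol name to use.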

I believe the simplest solution is to use lodepng.cpp (instead of lodepng.c) and delete the following lines from your super-resolution.cu:

    extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
    extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );

Then just compile and link everything C++ style:

    $ g++ -c lodepng.cpp
    $ nvcc -c super-resolution.cu
    nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
    $ nvcc -o super-resolution super-resolution.o lodepng.o
    nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
    $

If you really want to link lodepng.o C style instead of C++ style, then you will need to modify lodepng.h with appropriate extern "C" wrappers around the declarations of the functions you call. In my opinion this gets messy.
If you want to get rid of the warnings about sm_10 then add the nvcc switch to compile for a different architecture, e.g.:
```
nvcc -arch=sm_20 ... 
```
but make sure whatever you choose is compatible with your GPU.
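
If you prefer to keep building the library with a C compiler, the extern "C" wrapper mentioned above usually takes the following shape. This is only a sketch under assumptions: demo_codec.h, demo_sum_bytes and sum_rgba_pixel are hypothetical stand-ins, not lodepng's actual header or API, and the three pieces are placed in one file solely so that the example compiles on its own.

```cpp
#include <cassert>

// ---- Contents of a hypothetical C-compatible header, demo_codec.h ----
#ifdef __cplusplus
extern "C" {   // only C++ compilers see this; a C compiler skips it
#endif

unsigned demo_sum_bytes(const unsigned char* data, unsigned size);

#ifdef __cplusplus
}
#endif

// ---- Stand-in for the C source file (would be compiled with gcc) ----
// Defined here with C linkage so the single-file sketch links; in a real
// project this body would live in the .c file.
extern "C" unsigned demo_sum_bytes(const unsigned char* data, unsigned size) {
    unsigned total = 0;
    for (unsigned i = 0; i < size; ++i) {
        total += data[i];
    }
    return total;
}

// ---- C++ caller, playing the role of the host code in the .cu file ----
// It includes the header once and adds no extern "C" declarations of its own.
unsigned sum_rgba_pixel(const unsigned char* pixel) {
    assert(pixel != nullptr);
    return demo_sum_bytes(pixel, 4);   // 4 bytes per RGBA pixel
}
```

With the wrapper in the header, both the C compiler and the C++ compiler agree on the unmangled symbol name, so the caller should not repeat the prototypes with its own linkage specification.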

You don't need to change to lodepng.cpp; if the code is C, then you should use a C compiler. You just have to straighten out the linkage specification.

oz123 · Accepted Answer · 2014-04-25 04:45:32Z

Here is a simple snippet of the code.

The lodepng library is available at http://lodev.org/lodepng/.

Renaming lodepng.cpp to lodepng.c makes it usable from C.

Even at this level, there are compilation issues:

    "undefined reference to `lodepng_decode32_file'"
    "undefined reference to `lodepng_encode32_file'"

File: Makefile

    all: gpu
        gcc -g -O -c lodepng.c
        nvcc -c super-resolution.cu
        nvcc -o super-resolution-cuda super-resolution.o lodepng.o
        rm -rf super-resolution.o
        rm -rf lodepng.o

File: super-resolution.cu

    #import "cuda.h"
    #include "lodepng.h"
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include <cstdio>

    extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
    extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );

    //GPU 3x3 Blur.
    __global__ void gpuBlur(unsigned char* image, unsigned char* buffer, int width, int height) {
        int i = threadIdx.x%width;
        int j = threadIdx.x/width;
        if (i == 0 || j == 0 || i == width - 1 || j == height - 1)
            return;
        int k;
        for (k = 0; k <= 4; k++) {
            buffer[4*width*j + 4*i + k] = (image[4*width*(j-1) + 4*(i-1) + k] +
                                           image[4*width*(j-1) + 4*i + k] +
                                           image[4*width*(j-1) + 4*(i+1) + k] +
                                           image[4*width*j + 4*(i-1) + k] +
                                           image[4*width*j + 4*i + k] +
                                           image[4*width*j + 4*(i+1) + k] +
                                           image[4*width*(j+1) + 4*(i-1) + k] +
                                           image[4*width*(j+1) + 4*i + k] +
                                           image[4*width*(j+1) + 4*(i+1) + k])/9;
        }
    }

    int main(int argc, char *argv[]) {
        //Items for image processing;
        //int threshold = 100;
        unsigned int error;
        unsigned char* image;
        unsigned int width, height;

        //Load the image;
        if (argc > 1) {
            error = lodepng_decode32_file(&image, &width, &height, argv[1]);
            printf("Loaded file: %s[%d]\n", argv[1], error);
        } else {
            return 0;
        }
        unsigned char* buffer =(unsigned char*)malloc(sizeof(char) * 4*width*height);

        //GPU Blur Section.
        unsigned char* image_gpu;
        unsigned char* blur_gpu;
        cudaMalloc( (void**) &image_gpu, sizeof(char) * 4*width*height);
        cudaMalloc( (void**) &blur_gpu, sizeof(char) * 4*width*height);
        cudaMemcpy(image_gpu,image, sizeof(char) * 4*width*height, cudaMemcpyHostToDevice);
        cudaMemcpy(blur_gpu,image, sizeof(char) * 4*width*height, cudaMemcpyHostToDevice);
        gpuBlur<<< 1, height*width >>> (image_gpu, blur_gpu, width, height);
        cudaMemcpy(buffer, blur_gpu, sizeof(char) * 4*width*height, cudaMemcpyDeviceToHost);

        //Spit out buffer as an image.
        error = lodepng_encode32_file("GPU_OUTPUT1_Blur.png", buffer, width, height);

        cudaFree(image_gpu);
        cudaFree(blur_gpu);
        free(buffer);
        free(image);
    }

It's better to edit this type of material into your question. Don't post an answer on SO unless you are really answering the question (even if it is your own question).
