
I am trying to do a very basic example within CUDA. I would like to do a simple calculation on a list of floats.

vh[x] * k1 + k2

Currently I am trying this and it is not working:

Code 1

    #include <vector>
    #include <iostream>
    #include <thrust/transform.h>
    #include <thrust/functional.h>
    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>

    using namespace std;
    using namespace thrust;

    float k1 = 42, k2 = 7;

    int main(void)
    {
        vector<float> vh = { 0, 1, 2, 3, 4, 5, 6, 7 };
        device_vector<float> v = vh;
        device_vector<float> v_out(v.size());

        thrust::transform(v.begin(), v.end(), v_out.begin(),
                          [=] __device__(float x) { return x*k1 + k2; });

        for (size_t i = 0; i < v_out.size(); i++)
            std::cout << v_out[i] << std::endl;
    }

I am getting a very annoying lambda function error with the above code so I have tried to use a custom function as the code below shows:

Code 2

    #include <vector>
    #include <iostream>
    #include <thrust/transform.h>
    #include <thrust/functional.h>
    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>

    using namespace std;
    using namespace thrust;

    float k1 = 42, k2 = 7;

    float multiply(float x)
    {
        return x * k1 + k2;
    }

    int main(void)
    {
        vector<float> vh = { 0, 1, 2, 3, 4, 5, 6, 7 };
        device_vector<float> v = vh;
        device_vector<float> v_out(v.size());

        thrust::negate<float> op;
        thrust::transform(v.begin(), v.end(), v_out.begin(), multiply __device__(float x) );

        for (size_t i = 0; i < v_out.size(); i++)
            std::cout << v_out[i] << std::endl;

        std::getwchar();
    }

Can anyone tell me why Code 1 and/or Code 2 is not working?

  • What CUDA and Thrust versions are you using? Commented Jan 6, 2016 at 16:09
  • CUDA 7.5, and I guess it is Thrust 1.7.0. I haven't updated it since I installed CUDA. Commented Jan 6, 2016 at 16:10

1 Answer


For Code 2, you must wrap your function in an object to create a functor.

For Code 1, you have to pass the --expt-extended-lambda option to nvcc to enable full lambda support.

You must also either declare k1 and k2 as const, or stop making them global (static) variables, for instance by declaring them inside main so that the lambda captures them by value.

Prefer a functor for production code, unless your lambda is very simple.

Check out the following code for a working example:

    #include <vector>
    #include <iostream>
    #include <thrust/transform.h>
    #include <thrust/functional.h>
    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>

    using namespace std;
    using namespace thrust;

    template<class T>
    struct saxpi{
        T k1;
        T k2;
        saxpi(T _k1, T _k2){
            k1=_k1;
            k2=_k2;
        }
        __host__ __device__ T operator()(T &x) const{
            return x*k1+k2;
        }
    };

    int main(void)
    {
        float kk1=1, kk2=5;
        vector<float> vh = { 0, 1, 2, 3, 4, 5, 6, 7 };
        device_vector<float> v = vh;
        device_vector<float> v_out(v.size());

        cout<<"Lambda:"<<endl;
        auto ff = [=] __device__ (float x) {return kk1*x +kk2;};
        thrust::transform(v.begin(),v.end(),v_out.begin(),ff);
        for (size_t i = 0; i < v_out.size(); i++)
            std::cout << v_out[i] << std::endl;

        cout<<"Functor:"<<endl;
        saxpi<float> f(kk1,kk2);
        v_out.clear();
        v_out.resize(v.size());
        thrust::transform(v.begin(),v.end(),v_out.begin(),f);
        for (size_t i = 0; i < v_out.size(); i++)
            std::cout << v_out[i] << std::endl;
    }

Compile it using the following options: --expt-extended-lambda -std=c++11

    Lambda:
    5
    6
    7
    8
    9
    10
    11
    12
    Functor:
    5
    6
    7
    8
    9
    10
    11
    12

3 Comments

Thanks a lot! Can you explain the difference between __host__ __device__ and just __device__?
__host__ __device__ is just shorthand for writing the function once and telling the compiler to produce two versions of the same code: x86 assembly for the CPU side and PTX for the GPU.
This test example worked for me even without the __host__ qualifier
