I want to do a parallel reduction inside my kernel, on data that is already in shared memory. Is this possible with the Thrust library? Something like:

int maxValue = thrust::reduce(myIntArray, myIntArray + numberOfItems, (int) 0, thrust::maximum<int>());

But this doesn't work inside a kernel. Is it possible? Thank you.

1 Answer

No, thrust::reduce() is a host function that results in the execution of CUDA kernels if the data is on the GPU.
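For comparison, here is a rough sketch of the host-side call this refers to. The helper name hostSideMax is made up for illustration, and INT_MIN is used as the initial value (an assumption, so that negative inputs are handled too). The call is made from host code on a thrust::device_vector, and Thrust launches the kernels behind the scenes:

```cuda
#include <climits>
#include <vector>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>

// Hypothetical helper (not part of Thrust): copies the values to the GPU
// and reduces them with the maximum functor. This runs on the host;
// Thrust launches the device kernels internally.
int hostSideMax(const std::vector<int>& values)
{
    // Copy the host data into device memory.
    thrust::device_vector<int> d(values.begin(), values.end());

    // INT_MIN is the identity for max, so it never wins unless the input is empty.
    return thrust::reduce(d.begin(), d.end(), INT_MIN, thrust::maximum<int>());
}
```

Calling hostSideMax on a small std::vector<int> returns its largest element; it cannot be called from inside a __global__ function.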

You would have to dig into the Thrust source and find the __device__ functions it uses for reduction; those would be callable from your kernel. If the reduction logic lives in other __global__ kernels instead, you'll have to piece it together manually in order to use it.
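If piecing it together by hand is the route you take, a hand-written tree reduction over the shared-memory array is one common way to do it. The following is only a sketch under stated assumptions, not Thrust's own code: the names blockReduceMax, maxKernel, maxOnDevice and BLOCK_SIZE are made up for illustration, a single block is launched, the block size is a power of two, and CUDA error checking is omitted for brevity.

```cuda
#include <cassert>
#include <climits>
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK_SIZE = 256;  // assumed block size; must be a power of two

// Hypothetical helper: cooperative max reduction over a shared-memory array
// holding blockDim.x elements. Every thread of the block must call it, and
// the caller must __syncthreads() after filling sdata.
__device__ int blockReduceMax(int* sdata)
{
    const int tid = threadIdx.x;
    // Halve the active range each step; the left half absorbs the right half.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) {
            sdata[tid] = max(sdata[tid], sdata[tid + stride]);
        }
        __syncthreads();  // finish this step before any thread starts the next
    }
    return sdata[0];      // visible to all threads after the final barrier
}

// Example kernel: one block loads n values into shared memory and reduces them.
__global__ void maxKernel(const int* in, int n, int* out)
{
    __shared__ int sdata[BLOCK_SIZE];
    const int tid = threadIdx.x;

    // Each thread folds a strided slice of the input; INT_MIN pads idle threads.
    int local = INT_MIN;
    for (int i = tid; i < n; i += blockDim.x) {
        local = max(local, in[i]);
    }
    sdata[tid] = local;
    __syncthreads();

    const int blockMax = blockReduceMax(sdata);
    if (tid == 0) {
        *out = blockMax;
    }
}

// Host wrapper for trying the kernel out (error checks omitted for brevity).
int maxOnDevice(const std::vector<int>& host)
{
    assert(!host.empty());
    const int n = static_cast<int>(host.size());
    int* dIn = nullptr;
    int* dOut = nullptr;
    int result = INT_MIN;

    cudaMalloc(&dIn, n * sizeof(int));
    cudaMalloc(&dOut, sizeof(int));
    cudaMemcpy(dIn, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    maxKernel<<<1, BLOCK_SIZE>>>(dIn, n, dOut);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&result, dOut, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return result;
}
```

In your own kernel only blockReduceMax is needed: fill the shared array, call __syncthreads(), then have every thread in the block call the helper. Swapping max for + (and INT_MIN for 0) would turn it into a sum.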