Efficient way to reduce a subset of vector using thrust/cuda

Question

Suppose we have a Thrust device vector of size 10^16 and another vector of size 10^8 containing indices (not necessarily sorted). We want to sum every element of the first vector whose index appears in the second vector.

A naive approach would be to use Thrust's transform_reduce. However, I believe that would involve iterating over all the elements of the first vector.

Is there an efficient way?

Probably a permutation iterator: thrust.github.io/doc/classthrust_1_1permutation__iterator.html
– talonmies, Commented Mar 2, 2020 at 10:45
Thank you. Indeed, permutation_iterator is what I need here; the code runs faster with it.
– solver, Commented Mar 6, 2020 at 9:51
It would be good if you added a short answer explaining your solution for future visitors.
– talonmies, Commented Mar 6, 2020 at 11:16

solver · Accepted Answer · 2020-03-06 12:45:07Z

Following talonmies's suggestion of a permutation iterator, here is code that reduces a subset of a vector. I purposely chose small vector sizes to explain the idea. For reasonable sizes, it is faster than using inner_product with a mask vector.

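    #include <thrust/device_vector.h>
    #include <thrust/inner_product.h>
    #include <thrust/iterator/permutation_iterator.h>
    #include <thrust/reduce.h>
    #include <iostream>

    int main() {
        thrust::device_vector<double> vals(6);
        vals[0] = 2.0; vals[1] = 1.5; vals[2] = -1.2;
        vals[3] = 1.1; vals[4] = -4.3; vals[5] = 0.8;

        thrust::device_vector<int> indices(3);
        indices[0] = 1; indices[1] = 3; indices[2] = 5;

        // Mask with 1.0 at each selected index; used only for the inner_product comparison.
        thrust::device_vector<double> masks(6);
        for (int elm : indices) masks[elm] = 1.0;

        typedef thrust::device_vector<double>::iterator ValIterator;
        typedef thrust::device_vector<int>::iterator IndIterator;

        // The range length is set by the index iterators; the end iterator is never dereferenced.
        thrust::permutation_iterator<ValIterator, IndIterator> iter_begin(vals.begin(), indices.begin());
        thrust::permutation_iterator<ValIterator, IndIterator> iter_end(vals.begin(), indices.end());

        // Reduces only vals[indices[i]], so the work scales with indices.size().
        double sum_reduce = thrust::reduce(iter_begin, iter_end);
        std::cout << "sum permutation iterator: " << sum_reduce << std::endl;

        // Reads every element of vals and masks, so the work scales with vals.size().
        double sum_inner_product = thrust::inner_product(vals.begin(), vals.end(), masks.begin(), 0.0);
        std::cout << "sum inner product: " << sum_inner_product << std::endl;
        return 0;
    }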
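To make the difference in work explicit, here is a minimal host-side sketch in plain C++ (no Thrust, no GPU) of what the two reductions compute. The function names `gather_sum` and `masked_sum` are hypothetical and used only for illustration; the first mirrors the permutation-iterator reduction, the second mirrors the mask plus inner_product approach.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: host-side analog of reducing over a permutation iterator.
// Reads only vals[indices[k]], so the loop runs indices.size() times.
double gather_sum(const std::vector<double>& vals, const std::vector<int>& indices) {
    double sum = 0.0;
    for (int i : indices) {
        sum += vals[static_cast<std::size_t>(i)];
    }
    return sum;
}

// Hypothetical helper: host-side analog of the mask + inner_product approach.
// Builds a mask as long as vals and then reads every element of both vectors.
double masked_sum(const std::vector<double>& vals, const std::vector<int>& indices) {
    std::vector<double> mask(vals.size(), 0.0);
    for (int i : indices) {
        mask[static_cast<std::size_t>(i)] = 1.0;
    }
    return std::inner_product(vals.begin(), vals.end(), mask.begin(), 0.0);
}
```

The two functions agree when the indices are distinct. If an index is repeated, the gather version adds that element once per occurrence, while the mask version counts it once, so the index vector may need deduplicating first depending on which behavior is wanted.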
