I have an array of bool (or any equivalent structure). I need to count the number of true between the nth and mth position. Is it possible to get the compiler to vectorize this so that 64 or more elements are checked in one go?
2 Answers
That already happens if you compile with -O3 and potentially -march=native:
#include <algorithm> #include <iostream> void count(char* v) { auto x = std::count(&v[0], &v[100], true); std::cout << x << std::endl; } With gcc, this gives you a bunch of vp* instructions, which means that the count is vectorized. You can easily check this on godbolt.
2 Comments
Peter A
Is there a way to use bool instead of char to get 8 times as many operations? Thank you
mrks
std::vector<bool> does not seem to be parallelized, likely because of its proxy iterators. I'd look into bitpacking libraries and/or use simd popcount: github.com/WojciechMula/sse-popcountThere actually is a speciation of std::vector for bool: std::vector<bool> but it's not great and doesn't have a count function.
Instead you could consider using std:: bitset, that does offer a count member function. It's more or less designed for the purpose and will likely have an efficient implementation.