1029

8 bits representing the number 7 look like this:

00000111 

Three bits are set.

What are the algorithms to determine the number of set bits in a 32-bit integer?

12
  • 126
    This is the Hamming weight BTW. Commented Sep 20, 2008 at 19:17
  • 14
    What's a real-world application for this? (This isn't to be taken as a criticism--I'm just curious.) Commented Dec 10, 2010 at 20:59
  • 11
    Calculation of parity bit (look it up), which was used as simple error detection in communication. Commented Dec 11, 2010 at 0:28
  • 9
    @Dialecticus, calculating a parity bit is cheaper than calculating the Hamming weight Commented May 12, 2011 at 12:14
  • 18
    @spookyjon Let's say you have a graph represented as an adjacency matrix, which is essentially a bit set. If you want to calculate the number of edges of a vertex, it boils down to calculating the Hamming weight of one row in the bit set. Commented Oct 10, 2011 at 16:02
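The adjacency-matrix use case from the comment above can be sketched in C. This is only an illustration under stated assumptions: the graph has at most 32 vertices, each matrix row is stored in one `uint32_t`, and the names `popcount32` and `vertex_degree` are hypothetical.

```c
#include <stdint.h>

/* Hamming weight via the clear-lowest-set-bit loop (illustrative helper). */
static unsigned popcount32(uint32_t v)
{
    unsigned count = 0;
    while (v) {
        v &= v - 1;   /* clears the lowest set bit */
        count++;
    }
    return count;
}

/* Assumed layout: bit j of adj[i] is 1 when there is an edge from i to j. */
static unsigned vertex_degree(const uint32_t *adj, unsigned vertex)
{
    return popcount32(adj[vertex]);
}
```

With rows `{ 0x6, 0x5, 0x3, 0x0 }`, vertex 0 has degree 2 and vertex 3 has degree 0.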

66 Answers 66

5

Here is a portable module ( ANSI-C ) which can benchmark each of your algorithms on any architecture.

Your CPU has 9 bit bytes? No problem :-) At the moment it implements 2 algorithms, the K&R algorithm and a byte wise lookup table. The lookup table is on average 3 times faster than the K&R algorithm. If someone can figure a way to make the "Hacker's Delight" algorithm portable feel free to add it in.

    #ifndef _BITCOUNT_H_
    #define _BITCOUNT_H_

    /* Return the Hamming Weight of val, i.e. the number of 'on' bits. */
    int bitcount( unsigned int );

    /* List of available bitcount algorithms.
     * onTheFly:    Calculate the bitcount on demand.
     *
     * lookupTable: Uses a small lookup table to determine the bitcount.  This
     * method is on average 3 times as fast as onTheFly, but incurs a small
     * upfront cost to initialize the lookup table on the first call.
     *
     * strategyCount is just a placeholder.
     */
    enum strategy { onTheFly, lookupTable, strategyCount };

    /* String representations of the algorithm names */
    extern const char *strategyNames[];

    /* Choose which bitcount algorithm to use. */
    void setStrategy( enum strategy );

    #endif

.

#include <limits.h> #include "bitcount.h" /* The number of entries needed in the table is equal to the number of * unique values a char can represent which is always UCHAR_MAX + 1 */ static unsigned char _bitCountTable[UCHAR_MAX + 1]; static unsigned int _lookupTableInitialized = 0; static int _defaultBitCount( unsigned int val ) { int count; /* Starting with: * 1100 - 1 == 1011, 1100 & 1011 == 1000 * 1000 - 1 == 0111, 1000 & 0111 == 0000 */ for ( count = 0; val; ++count ) val &= val - 1; return count; } /* Looks up each byte of the integer in a lookup table. * * The first time the function is called it initializes the lookup table. */ static int _tableBitCount( unsigned int val ) { int bCount = 0; if ( !_lookupTableInitialized ) { unsigned int i; for ( i = 0; i != UCHAR_MAX + 1; ++i ) _bitCountTable[i] = ( unsigned char )_defaultBitCount( i ); _lookupTableInitialized = 1; } for ( ; val; val >>= CHAR_BIT ) bCount += _bitCountTable[val & UCHAR_MAX]; return bCount; } static int ( *_bitcount ) ( unsigned int ) = _defaultBitCount; const char *strategyNames[] = { "onTheFly", "lookupTable" }; void setStrategy( enum strategy s ) { switch ( s ) { case onTheFly: _bitcount = _defaultBitCount; break; case lookupTable: _bitcount = _tableBitCount; break; case strategyCount: break; } } /* Just a forwarding function which will call whichever version * of the algorithm has been selected by the client */ int bitcount( unsigned int val ) { return _bitcount( val ); } #ifdef _BITCOUNT_EXE_ #include <stdio.h> #include <stdlib.h> #include <time.h> /* Use the same sequence of pseudo random numbers to benchmark each * Hamming Weight algorithm. 
*/ void benchmark( int reps ) { clock_t start, stop; int i, j; static const int iterations = 1000000; for ( j = 0; j != strategyCount; ++j ) { setStrategy( j ); srand( 257 ); start = clock( ); for ( i = 0; i != reps * iterations; ++i ) bitcount( rand( ) ); stop = clock( ); printf ( "\n\t%d pseudo-random integers using %s: %f seconds\n\n", reps * iterations, strategyNames[j], ( double )( stop - start ) / CLOCKS_PER_SEC ); } } int main( void ) { int option; while ( 1 ) { printf( "Menu Options\n" "\t1.\tPrint the Hamming Weight of an Integer\n" "\t2.\tBenchmark Hamming Weight implementations\n" "\t3.\tExit ( or cntl-d )\n\n\t" ); if ( scanf( "%d", &option ) == EOF ) break; switch ( option ) { case 1: printf( "Please enter the integer: " ); if ( scanf( "%d", &option ) != EOF ) printf ( "The Hamming Weight of %d ( 0x%X ) is %d\n\n", option, option, bitcount( option ) ); break; case 2: printf ( "Please select number of reps ( in millions ): " ); if ( scanf( "%d", &option ) != EOF ) benchmark( option ); break; case 3: goto EXIT; break; default: printf( "Invalid option\n" ); } } EXIT: printf( "\n" ); return 0; } #endif 

1 Comment

I like very much your plug-in, polymorphic approach, as well as the switch to build as a reusable library or stand-alone, test executable. Very well thought =)
4

32-bit or not? I just came up with this method in Java after reading "Cracking the Coding Interview", 4th edition, exercise 5.5 (chap. 5: Bit Manipulation). If the least significant bit is 1, increment count, then right-shift the integer.

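    public static int bitCount(int n) {
        int count = 0;
        // Unsigned shift (>>>) so the loop also terminates for negative n;
        // a signed shift (>>) would keep copying the sign bit in forever.
        for (int i = n; i != 0; i = i >>> 1) {
            count += i & 1;
        }
        return count;
    }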

I think this one is more intuitive than the solutions with constant 0x33333333 no matter how fast they are. It depends on your definition of "best algorithm" .

2 Comments

after reading others, it's similar to paxdiablo's answer . I agree on "readability over cleverness any time".
In bitCount(), the for loop never terminates when n < 0.
4

Naive Solution

Time Complexity is O(no. of bits in n)

    int countSet(unsigned int n)
    {
        int res = 0;
        while (n != 0) {
            res += (n & 1);
            n >>= 1;   // logical right shift, like C unsigned or Java >>>
        }
        return res;
    }

Brian Kernighan's algorithm

Time Complexity is O(no of set bits in n)

    int countSet(unsigned int n)
    {
        int res = 0;
        while (n != 0) {
            n = (n & (n - 1));   // clears the lowest set bit
            res++;
        }
        return res;
    }

Lookup table method for a 32-bit number: in this method we break the 32-bit number into four 8-bit chunks.

Time Complexity is O(1)

    static unsigned char table[256];  /* the table size is 256,
                                         the number of values i&0xFF (8 bits) can have */

    void initialize()  // holds the number of set bits from 0 to 255
    {
        table[0] = 0;
        for (unsigned int i = 1; i < 256; i++)
            table[i] = (i & 1) + table[i >> 1];
    }

    int countSet(unsigned int n)
    {
        // 0xff is hexadecimal representation of 8 set bits.
        int res = table[n & 0xff];
        n = n >> 8;
        res = res + table[n & 0xff];
        n = n >> 8;
        res = res + table[n & 0xff];
        n = n >> 8;
        res = res + table[n & 0xff];
        return res;
    }

Comments

3

Python solution:

    def hammingWeight(n: int) -> int:
        sums = 0
        while (n != 0):
            sums += 1
            n = n & (n - 1)
        return sums

In the binary representation, the least significant 1-bit in n always corresponds to a 0-bit in n - 1. Therefore, anding the two numbers n and n - 1 always flips the least significant 1-bit in n to 0, and keeps all other bits the same.
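To make the claim concrete: for n = 12 (1100), n - 1 = 11 (1011), and 1100 & 1011 = 1000, so only the lowest 1-bit disappears. The C sketch below checks the same property against the separately isolated lowest bit. The helper names are hypothetical, and unsigned 32-bit arithmetic is assumed so that n - 1 is well-defined for every input.

```c
#include <stdint.h>

/* Isolates the least significant 1-bit of n (0 when n is 0). */
static uint32_t lowest_set_bit(uint32_t n)
{
    return n & (~n + 1u);   /* two's-complement negation, same as n & -n */
}

/* Returns n with its least significant 1-bit cleared. */
static uint32_t clear_lowest_set_bit(uint32_t n)
{
    return n & (n - 1u);
}
```

For any n, `clear_lowest_set_bit(n)` equals `n - lowest_set_bit(n)`, which is why each loop iteration removes exactly one set bit.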


Comments

2

Personally I use this :

    public static int myBitCount(long L) {
        int count = 0;
        while (L != 0) {
            count++;
            L ^= L & -L;   // L & -L isolates the lowest set bit; XOR clears it
        }
        return count;
    }

1 Comment

I like this method. It's not that hard to understand; there is only one line that really needs any comment. And when you understand that one line it is like a little gem that you have been given. (What was in my coffee this morning?) I have used this without the loop to check if only one single flag bit is set, i.e. process one way if only one flag bit is set, some other (more tedious) way if multiple flags are set. (edit: formatting)
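The single-flag check mentioned in this comment could look roughly like the following in C. This is a sketch of one reading of the comment, not the commenter's actual code; the name `has_single_flag` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit of flags is set. */
static bool has_single_flag(uint64_t flags)
{
    uint64_t lowest = flags & (~flags + 1u);   /* same value as flags & -flags */
    /* Clearing the lowest set bit must leave nothing behind. */
    return flags != 0 && (flags ^ lowest) == 0;
}
```

The explicit `flags != 0` test is needed because zero also satisfies the XOR condition.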
2
    int countBits(int x)
    {
        int n = 0;
        if (x) do n++;
               while(x=x&(x-1));
        return n;
    }

Or also:

int countBits(int x) { return (x)? 1+countBits(x&(x-1)): 0; } 

7 1/2 years after my original answer, @PeterMortensen questioned if this was even valid C syntax. I posted a link to an online compiler showing that it is in fact perfectly valid syntax (code below).

    #include <stdio.h>

    int countBits(int x)
    {
        int n = 0;
        if (x) do n++;              /* Totally Normal Valid code. */
               while(x=x&(x-1));    /* Nothing to see here.       */
        return n;
    }

    int main(void) {
        printf("%d\n", countBits(25));
        return 0;
    }

Output:

3 

If you want to re-write it for clarity, it would look like:

    if (x)
    {
        do
        {
            n++;
        } while(x=x&(x-1));
    }

But that seems excessive to my eye.

However, I've also realized the function can be made shorter, but perhaps more cryptic, written as:

int countBits(int x) { int n = 0; while (x) x=(n++,x&(x-1)); return n; } 

2 Comments

In which language is if (x) do n++; accepted? Does it actually compile?
1

Here is a solution that has not been mentioned so far, using bitfields. The following program counts the set bits in an array of 100000000 16-bit integers using 4 different methods. Timing results are given in parentheses (on MacOSX, with gcc -O3):

#include <stdio.h> #include <stdlib.h> #define LENGTH 100000000 typedef struct { unsigned char bit0 : 1; unsigned char bit1 : 1; unsigned char bit2 : 1; unsigned char bit3 : 1; unsigned char bit4 : 1; unsigned char bit5 : 1; unsigned char bit6 : 1; unsigned char bit7 : 1; } bits; unsigned char sum_bits(const unsigned char x) { const bits *b = (const bits*) &x; return b->bit0 + b->bit1 + b->bit2 + b->bit3 \ + b->bit4 + b->bit5 + b->bit6 + b->bit7; } int NumberOfSetBits(int i) { i = i - ((i >> 1) & 0x55555555); i = (i & 0x33333333) + ((i >> 2) & 0x33333333); return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24; } #define out(s) \ printf("bits set: %lu\nbits counted: %lu\n", 8*LENGTH*sizeof(short)*3/4, s); int main(int argc, char **argv) { unsigned long i, s; unsigned short *x = malloc(LENGTH*sizeof(short)); unsigned char lut[65536], *p; unsigned short *ps; int *pi; /* set 3/4 of the bits */ for (i=0; i<LENGTH; ++i) x[i] = 0xFFF0; /* sum_bits (1.772s) */ for (i=LENGTH*sizeof(short), p=(unsigned char*) x, s=0; i--; s+=sum_bits(*p++)); out(s); /* NumberOfSetBits (0.404s) */ for (i=LENGTH*sizeof(short)/sizeof(int), pi=(int*)x, s=0; i--; s+=NumberOfSetBits(*pi++)); out(s); /* populate lookup table */ for (i=0, p=(unsigned char*) &i; i<sizeof(lut); ++i) lut[i] = sum_bits(p[0]) + sum_bits(p[1]); /* 256-bytes lookup table (0.317s) */ for (i=LENGTH*sizeof(short), p=(unsigned char*) x, s=0; i--; s+=lut[*p++]); out(s); /* 65536-bytes lookup table (0.250s) */ for (i=LENGTH, ps=x, s=0; i--; s+=lut[*ps++]); out(s); free(x); return 0; } 

While the bitfield version is very readable, the timing results show that it is over 4x slower than NumberOfSetBits(). The lookup-table based implementations are still quite a bit faster, in particular with a 65 kB table.

2 Comments

Note that this is a microbenchmark and should be taken with a grain of salt. For example, setting up the lookup table just before using it is priming the cache in a way that may be rare in real code.
Re "4x slower ... quite a bit faster": That needs to be qualified. For example, is it true for an AVR microcontroller? What kind of system is presumed? An AMD Ryzen 3950X?
1
    int bitcount(unsigned int n)
    {
        int count = 0;
        while (n)
        {
            count += n & 0x1u;
            n >>= 1;
        }
        return count;
    }

This iterated count runs in time proportional to the position of the highest set bit. It loops through the bits one at a time and stops as soon as no set bits remain, because of the while condition. It is useful if the set bits are sparse and among the least significant bits.

Comments

1

In Java (including 8 and 9; the method has existed since Java 5), just invoke Integer.bitCount.

1 Comment

How does this answer What are the algorithms to [popcount]?
1

You can use the built-in function __builtin_popcount(). It is not part of standard C++; it is a built-in function of the GCC compiler. It returns the number of set bits in an integer.

int __builtin_popcount (unsigned int x); 

Reference : Bit Twiddling Hacks
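Because the builtin is compiler-specific, a portable wrapper is one way to use it. The sketch below assumes that checking `__GNUC__` / `__clang__` is an acceptable way to detect the builtin; the names `popcount_uint` and `popcount_fallback` are hypothetical.

```c
/* Portable fallback: clear the lowest set bit until none remain. */
static int popcount_fallback(unsigned int x)
{
    int count = 0;
    while (x) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Uses the GCC/Clang builtin when available, otherwise the loop above. */
static int popcount_uint(unsigned int x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcount(x);
#else
    return popcount_fallback(x);
#endif
}
```

Whether the builtin compiles to a single hardware instruction depends on the target options (for example `-mpopcnt` or `-march=native` on x86); otherwise the compiler falls back to a software routine.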

1 Comment

How does this answer What are the algorithms to [popcount]?
1

From Python 3.10 onwards, you will be able to use the int.bit_count() function, but for the time being, you can define this function yourself.

    def bit_count(integer):
        return bin(integer).count("1")

Comments

1

Another Hamming weight algorithm if you're on a BMI2 capable CPU:

the_weight = __tzcnt_u64(~_pext_u64(data[i], data[i])); 

4 Comments

(cram set bits to the low end, invert, count unset bits from the low end)
Fun, but of no practical value. All BMI2 CPUs have popcnt. pext same,same to pack the bits could be an interesting building-block for something else, but tzcnt and pext both run on the same port as popcnt on Intel CPUs, and pext is very slow on AMD. (agner.org/optimize). You can sort of emulate pext x,x with (1ULL << popcnt(x)) - 1, except for the x==0 case. x86 shifts can't shift out all the bits, because they mask the shift count, and you have to watch out for C undefined behaviour with out of range counts.
What compiler was it tried with? It seems to be compiler-specific (compiler extension __tzcnt_u64? Library function '__tzcnt_u64')? Can you add some context to the answer?
Sure. for MacOS and Linux #include <immintrin.h> for Visual Studio use #include <intrin.h>. Modern compilers should not need any specific linker flags. However let me know if you're stuck and I'll help you out.
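To see why the trick works without needing a BMI2 machine, here is a portable C sketch with software stand-ins for the two instructions. `pext_soft` and `tzcnt_soft` are hypothetical helpers that only mimic the behaviour of `_pext_u64` and `__tzcnt_u64`; they are slow and meant purely for illustration.

```c
#include <stdint.h>

/* Stand-in for PEXT: packs the bits of src selected by mask into the low end. */
static uint64_t pext_soft(uint64_t src, uint64_t mask)
{
    uint64_t out = 0;
    unsigned pos = 0;
    while (mask) {
        uint64_t low = mask & (~mask + 1u);   /* lowest set bit of mask */
        if (src & low)
            out |= (uint64_t)1 << pos;
        pos++;
        mask &= mask - 1;
    }
    return out;
}

/* Stand-in for TZCNT: number of trailing zero bits, 64 for an input of 0. */
static unsigned tzcnt_soft(uint64_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 64;
    while (!(x & 1)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Cram set bits to the low end, invert, count zeros from the low end. */
static unsigned weight_via_pext(uint64_t x)
{
    return tzcnt_soft(~pext_soft(x, x));
}
```

For x = 0xF0F0 the packed value is 0xFF, its inverse has eight trailing zeros, and the result is 8.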
1

I am providing one more unmentioned algorithm, called Parallel, taken from here. The nice point about it is that it is generic, meaning that the code is the same for bit sizes 8, 16, 32, 64, and 128.

I checked the correctness of its values and timings on an amount of 2^26 numbers for bits sizes 8, 16, 32, and 64. See the timings below.

This algorithm is the first code snippet below. The other two are mentioned here just for reference, because I tested and compared against them.

The algorithms are coded in C++ to be generic, but they can easily be adapted to plain C.

    #include <type_traits>
    #include <cstdint>

    template <typename IntT>
    inline size_t PopCntParallel(IntT n) {
        // https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
        using T = std::make_unsigned_t<IntT>;

        T v = T(n);
        v = v - ((v >> 1) & (T)~(T)0/3);                                // temp
        v = (v & (T)~(T)0/15*3) + ((v >> 2) & (T)~(T)0/15*3);           // temp
        v = (v + (v >> 4)) & (T)~(T)0/255*15;                           // temp
        return size_t((T)(v * ((T)~(T)0/255)) >> (sizeof(T) - 1) * 8);  // count
    }

Below are two algorithms that I compared with. One is the Kernighan simple method with a loop, taken from here.

    template <typename IntT>
    inline size_t PopCntKernighan(IntT n) {
        // http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
        using T = std::make_unsigned_t<IntT>;
        T v = T(n);
        size_t c;
        for (c = 0; v; ++c)
            v &= v - 1;  // Clear the least significant bit set
        return c;
    }

Another one is using built-in __popcnt16()/__popcnt()/__popcnt64() MSVC's intrinsic (doc here). Or __builtin_popcount of CLang/GCC (doc here). This intrinsic should provide a very optimized version, possibly hardware:

    #ifdef _MSC_VER
        // https://learn.microsoft.com/en-us/cpp/intrinsics/popcnt16-popcnt-popcnt64?view=msvc-160
        #include <intrin.h>
        #define popcnt16 __popcnt16
        #define popcnt32 __popcnt
        #define popcnt64 __popcnt64
    #else
        // https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
        #define popcnt16 __builtin_popcount
        #define popcnt32 __builtin_popcount
        #define popcnt64 __builtin_popcountll
    #endif

    template <typename IntT>
    inline size_t PopCntBuiltin(IntT n) {
        using T = std::make_unsigned_t<IntT>;
        T v = T(n);
        if constexpr(sizeof(IntT) <= 2)
            return popcnt16(uint16_t(v));
        else if constexpr(sizeof(IntT) <= 4)
            return popcnt32(uint32_t(v));
        else if constexpr(sizeof(IntT) <= 8)
            return popcnt64(uint64_t(v));
        else
            static_assert([]{ return false; }());
    }

Below are the timings, in nanoseconds per one number. All timings are done for 2^26 random numbers. Timings are compared for all three algorithms and all bit sizes among 8, 16, 32, and 64. In sum, all tests took 16 seconds on my machine. The high-resolution clock was used.

    08 bit Builtin 8.2 ns
    08 bit Parallel 8.2 ns
    08 bit Kernighan 26.7 ns

    16 bit Builtin 7.7 ns
    16 bit Parallel 7.7 ns
    16 bit Kernighan 39.7 ns

    32 bit Builtin 7.0 ns
    32 bit Parallel 7.0 ns
    32 bit Kernighan 47.9 ns

    64 bit Builtin 7.5 ns
    64 bit Parallel 7.5 ns
    64 bit Kernighan 59.4 ns

    128 bit Builtin 7.8 ns
    128 bit Parallel 13.8 ns
    128 bit Kernighan 127.6 ns

As one can see, the provided Parallel algorithm (first among three) is as good as MSVC's/CLang's intrinsic.


For reference, below is full code that I used to test speed/time/correctness of all functions.

As a bonus this code (unlike short code snippets above) also tests 128 bit size, but only under CLang/GCC (not MSVC), as they have unsigned __int128.

Try it online!

#include <type_traits> #include <cstdint> using std::size_t; #if defined(_MSC_VER) && !defined(__clang__) #define IS_MSVC 1 #else #define IS_MSVC 0 #endif #if IS_MSVC #define HAS128 false #else using int128_t = __int128; using uint128_t = unsigned __int128; #define HAS128 true #endif template <typename T> struct UnSignedT { using type = std::make_unsigned_t<T>; }; #if HAS128 template <> struct UnSignedT<int128_t> { using type = uint128_t; }; template <> struct UnSignedT<uint128_t> { using type = uint128_t; }; #endif template <typename T> using UnSigned = typename UnSignedT<T>::type; template <typename IntT> inline size_t PopCntParallel(IntT n) { // https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel using T = UnSigned<IntT>; T v = T(n); v = v - ((v >> 1) & (T)~(T)0/3); // temp v = (v & (T)~(T)0/15*3) + ((v >> 2) & (T)~(T)0/15*3); // temp v = (v + (v >> 4)) & (T)~(T)0/255*15; // temp return size_t((T)(v * ((T)~(T)0/255)) >> (sizeof(T) - 1) * 8); // count } template <typename IntT> inline size_t PopCntKernighan(IntT n) { // http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan using T = UnSigned<IntT>; T v = T(n); size_t c; for (c = 0; v; ++c) v &= v - 1; // Clear the least significant bit set return c; } #if IS_MSVC // https://learn.microsoft.com/en-us/cpp/intrinsics/popcnt16-popcnt-popcnt64?view=msvc-160 #include <intrin.h> #define popcnt16 __popcnt16 #define popcnt32 __popcnt #define popcnt64 __popcnt64 #else // https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html #define popcnt16 __builtin_popcount #define popcnt32 __builtin_popcount #define popcnt64 __builtin_popcountll #endif #define popcnt128(x) (popcnt64(uint64_t(x)) + popcnt64(uint64_t(x >> 64))) template <typename IntT> inline size_t PopCntBuiltin(IntT n) { using T = UnSigned<IntT>; T v = T(n); if constexpr(sizeof(IntT) <= 2) return popcnt16(uint16_t(v)); else if constexpr(sizeof(IntT) <= 4) return popcnt32(uint32_t(v)); else if constexpr(sizeof(IntT) <= 8) return 
popcnt64(uint64_t(v)); else if constexpr(sizeof(IntT) <= 16) return popcnt128(uint128_t(v)); else static_assert([]{ return false; }()); } #include <random> #include <vector> #include <chrono> #include <string> #include <iostream> #include <iomanip> #include <map> inline double Time() { static auto const gtb = std::chrono::high_resolution_clock::now(); return std::chrono::duration_cast<std::chrono::duration<double>>( std::chrono::high_resolution_clock::now() - gtb).count(); } template <typename T, typename F> void Test(std::string const & name, F f) { std::mt19937_64 rng{123}; size_t constexpr bit_size = sizeof(T) * 8, ntests = 1 << 6, nnums = 1 << 14; std::vector<T> nums(nnums); for (size_t i = 0; i < nnums; ++i) nums[i] = T(rng() % ~T(0)); static std::map<size_t, size_t> times; double min_time = 1000; for (size_t i = 0; i < ntests; ++i) { double timer = Time(); size_t sum = 0; for (size_t j = 0; j < nnums; j += 4) sum += f(nums[j + 0]) + f(nums[j + 1]) + f(nums[j + 2]) + f(nums[j + 3]); auto volatile vsum = sum; min_time = std::min(min_time, (Time() - timer) / nnums); if (times.count(bit_size) && times.at(bit_size) != sum) std::cout << "Wrong bit cnt checksum!" << std::endl; times[bit_size] = sum; } std::cout << std::setw(2) << std::setfill('0') << bit_size << " bit " << name << " " << std::fixed << std::setprecision(1) << min_time * 1000000000 << " ns" << std::endl; } int main() { #define TEST(T) \ Test<T>("Builtin", PopCntBuiltin<T>); \ Test<T>("Parallel", PopCntParallel<T>); \ Test<T>("Kernighan", PopCntKernighan<T>); \ std::cout << std::endl; TEST(uint8_t); TEST(uint16_t); TEST(uint32_t); TEST(uint64_t); #if HAS128 TEST(uint128_t); #endif #undef TEST } 

1 Comment

A comment didn't fit, so: stackoverflow.com/a/75410534/7880616
0

A simple way which should work nicely for a small number of bits is something like this (for 4 bits in this example):

(i & 1) + (i & 2)/2 + (i & 4)/4 + (i & 8)/8

Would others recommend this for a small number of bits as a simple solution?
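For comparison, here is how the 4-bit expression might be wrapped in C and applied to wider values. This is only a sketch; `count_nibble` and `count_by_nibbles` are hypothetical names, and looping over eight nibbles is one possible extension rather than something the answer proposes.

```c
#include <stdint.h>

/* Counts the set bits of a 4-bit value by summing each bit individually. */
static unsigned count_nibble(unsigned i)
{
    return (i & 1) + (i & 2) / 2 + (i & 4) / 4 + (i & 8) / 8;
}

/* Applies the nibble formula to each of the eight nibbles of a 32-bit value. */
static unsigned count_by_nibbles(uint32_t v)
{
    unsigned total = 0;
    for (int k = 0; k < 8; k++) {
        total += count_nibble((v >> (4 * k)) & 0xFu);
    }
    return total;
}
```

The expression stays readable for 4 bits, but it grows by one term per bit, which is why the other answers switch to loops, tables, or parallel sums for 32-bit inputs.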

Comments

0

Here's something that works in PHP (all PHP integers are 32 bit signed, thus 31 bit):

function bits_population($nInteger) { $nPop=0; while($nInteger) { $nInteger^=(1<<(floor(1+log($nInteger)/log(2))-1)); $nPop++; } return $nPop; } 

Comments

0
    #!/usr/local/bin/perl

    $c = 0x11BBBBAB;
    $count = 0;
    $m = 0x00000001;

    for ($i = 0; $i < 32; $i++) {
        $f = $c & $m;
        if ($f == 1) {
            $count++;
        }
        $c = $c >> 1;
    }

    printf("%d", $count);

I've done it through a Perl script. The number taken is $c = 0x11BBBBAB. Each B (1011) has three 1s, A (1010) has two, and each 1 has one, so in total 1+1+3+3+3+3+2+3 = 19.

1 Comment

Is there something special about this implementation? The accepted answer is obviously much more efficient than your answer, so how is this a "best" solution (as requested in the question)?
0

Here is the sample code, which might be useful.

private static final int[] bitCountArr = new int[]{0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8}; private static final int firstByteFF = 255; public static final int getCountOfSetBits(int value){ int count = 0; for(int i=0;i<4;i++){ if(value == 0) break; count += bitCountArr[value & firstByteFF]; value >>>= 8; } return count; } 

Comments

0
    def hammingWeight(n):
        count = 0
        while n:
            if n & 1:
                count += 1
            n >>= 1
        return count

Comments

0

For Java, there is a java.util.BitSet. https://docs.oracle.com/javase/8/docs/api/java/util/BitSet.html

cardinality(): Returns the number of bits set to true in this BitSet.

The BitSet is memory efficient since its bits are packed into an array of long values.

Comments

0

For those who want it in C++11 for any unsigned integer type as a constexpr function (tacklelib/include/tacklelib/utility/math.hpp):

#include <stdint.h> #include <limits> #include <type_traits> const constexpr uint32_t uint32_max = (std::numeric_limits<uint32_t>::max)(); namespace detail { template <typename T> inline constexpr T _count_bits_0(const T & v) { return v - ((v >> 1) & 0x55555555); } template <typename T> inline constexpr T _count_bits_1(const T & v) { return (v & 0x33333333) + ((v >> 2) & 0x33333333); } template <typename T> inline constexpr T _count_bits_2(const T & v) { return (v + (v >> 4)) & 0x0F0F0F0F; } template <typename T> inline constexpr T _count_bits_3(const T & v) { return v + (v >> 8); } template <typename T> inline constexpr T _count_bits_4(const T & v) { return v + (v >> 16); } template <typename T> inline constexpr T _count_bits_5(const T & v) { return v & 0x0000003F; } template <typename T, bool greater_than_uint32> struct _impl { static inline constexpr T _count_bits_with_shift(const T & v) { return detail::_count_bits_5( detail::_count_bits_4( detail::_count_bits_3( detail::_count_bits_2( detail::_count_bits_1( detail::_count_bits_0(v)))))) + count_bits(v >> 32); } }; template <typename T> struct _impl<T, false> { static inline constexpr T _count_bits_with_shift(const T & v) { return 0; } }; } template <typename T> inline constexpr T count_bits(const T & v) { static_assert(std::is_integral<T>::value, "type T must be an integer"); static_assert(!std::is_signed<T>::value, "type T must be not signed"); return uint32_max >= v ? detail::_count_bits_5( detail::_count_bits_4( detail::_count_bits_3( detail::_count_bits_2( detail::_count_bits_1( detail::_count_bits_0(v)))))) : detail::_impl<T, sizeof(uint32_t) < sizeof(v)>::_count_bits_with_shift(v); } 

Plus tests in google test library:

#include <stdlib.h> #include <time.h> namespace { template <typename T> inline uint32_t _test_count_bits(const T & v) { uint32_t count = 0; T n = v; while (n > 0) { if (n % 2) { count += 1; } n /= 2; } return count; } } TEST(FunctionsTest, random_count_bits_uint32_100K) { srand(uint_t(time(NULL))); for (uint32_t i = 0; i < 100000; i++) { const uint32_t r = uint32_t(rand()) + (uint32_t(rand()) << 16); ASSERT_EQ(_test_count_bits(r), count_bits(r)); } } TEST(FunctionsTest, random_count_bits_uint64_100K) { srand(uint_t(time(NULL))); for (uint32_t i = 0; i < 100000; i++) { const uint64_t r = uint64_t(rand()) + (uint64_t(rand()) << 16) + (uint64_t(rand()) << 32) + (uint64_t(rand()) << 48); ASSERT_EQ(_test_count_bits(r), count_bits(r)); } } 

Comments

0

Convert the integer to a binary string and count the ones.

PHP solution:

substr_count(decbin($integer), '1'); 

2 Comments

Sounds like the javascript way. I would suggest using a webservice instead!
So, first do an expensive conversion and then do a set of expensive comparisons? Sounds like a very slow method.
0

A simple algorithm to count the number of set bits:

    int countbits(unsigned int n)
    {
        int count = 0;
        while (n != 0)
        {
            n = n & (n - 1);
            count++;
        }
        return count;
    }

Take the example of 11 (1011) and try manually running through the algorithm. It should help you a lot!
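Running the suggested example by hand gives 1011 → 1010 → 1000 → 0000, so the loop body executes three times. The C sketch below records those intermediate values so the walk-through can be checked; `countbits_trace` and its parameters are hypothetical names added for illustration.

```c
/* Runs the n & (n - 1) loop and stores n after every iteration in steps[].
 * Returns the number of set bits; at most max_steps values are stored. */
static int countbits_trace(unsigned int n, unsigned int *steps, int max_steps)
{
    int count = 0;
    while (n != 0) {
        n = n & (n - 1);
        if (count < max_steps)
            steps[count] = n;
        count++;
    }
    return count;
}
```

For n = 11 the recorded values are 10, 8 and 0, and the function returns 3.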

1 Comment

What programming language? C++?
0

Here is the functional master race recursive solution, and it is by far the purest one (and can be used with any bit length!):

    template<typename T>
    int popcnt(T n)
    {
        if (n > 0)
            return (n & 1) + popcnt(n >> 1);  // parentheses needed: + binds tighter than &
        return 0;
    }

1 Comment

What programming language? C++?
0

Kotlin pre 1.4

    fun NumberOfSetBits(i: Int): Int {
        var i = i
        i -= (i ushr 1 and 0x55555555)
        i = (i and 0x33333333) + (i ushr 2 and 0x33333333)
        return (i + (i ushr 4) and 0x0F0F0F0F) * 0x01010101 ushr 24
    }

This is more or less a copy of the answer seen in the top answer.

It is with the Java fixes and is then converted using the converter in the IntelliJ IDEA Community Edition

1.4 and beyond (as of 2021-05-05 - it could change in the future).

 fun NumberOfSetBits(i: Int): Int { return i.countOneBits() } 

Under the hood it uses Integer.bitCount as seen here:

    @SinceKotlin("1.4")
    @WasExperimental(ExperimentalStdlibApi::class)
    @kotlin.internal.InlineOnly
    public actual inline fun Int.countOneBits(): Int = Integer.bitCount(this)

Comments

0

I'll contribute to @Arty's answer

__popcnt16()/__popcnt()/__popcnt64() MSVC's intrinsic (doc here)

popcnt instruction, as noted in "Remarks" section, is available as part of SSE4 instruction set and there is a relatively high chance of it not being available.

If you run code that uses these intrinsics on hardware that doesn't support the popcnt instruction, the results are unpredictable.

So, you need to implement a check as per "Remarks" section:

To determine hardware support for the popcnt instruction, call the __cpuid intrinsic with InfoType=0x00000001 and check bit 23 of CPUInfo[2] (ECX). This bit is 1 if the instruction is supported, and 0 otherwise.

Here's how you do it:

    #include <intrin.h>

    unsigned popcnt(const unsigned input)
    {
        struct cpuinfo_t
        {
            union
            {
                int regs[4];
                struct
                {
                    long eax, ebx, ecx, edx;
                };
            };

            cpuinfo_t() noexcept : regs() {}
        } cpuinfo;

        // EAX=1: Processor Info and Feature Bits
        __cpuid(cpuinfo.regs, 1);

        // ECX bit 23: popcnt
        if (_bittest(&cpuinfo.ecx, 23))
        {
            return __popcnt(input);
        }

        // Choose any fallback implementation you like, there's already a ton of them
        unsigned num = input;
        num = (num & 0x55555555) + (num >> 1 & 0x55555555);
        num = (num & 0x33333333) + (num >> 2 & 0x33333333);
        num = (num & 0x0F0F0F0F) + (num >> 4 & 0x0F0F0F0F);
        num = (num & 0x00FF00FF) + (num >> 8 & 0x00FF00FF);
        num = (num & 0x0000FFFF) + (num >> 16 & 0x0000FFFF);
        return num;
    }

Comments

0

In C#, what about a one-liner:

BitOperations.PopCount(Mask); 

Returns the population count (number of bits set) of a mask. Similar in behavior to the x86 instruction POPCNT. Compatible with x64! It uses an intrinsic (built-in instruction of the X86 architecture) to count the number of bits very fast in a 32 bit or 64 bit value.

NOTE: BitOperations.PopCount() is not CLS compliant. Take this under consideration.

Cheers

Comments

0

I have not seen this approach anywhere:

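    int nbits(unsigned char v) {
        /* Unsigned constants: the products are meant to wrap modulo 2^32,
           which would be signed overflow with plain int constants. */
        return ((((v - ((v >> 1) & 0x55u)) * 0x1010101u) & 0x30c00c03u) * 0x10040041u) >> 28;
    }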

(spelled out in post markdown (viewable via Edit))

It works per byte, so it would have to be called four times for a 32-bit integer. It is derived from the sideways addition, but it uses two 32-bit multiplications to reduce the number of instructions to only seven.

Most current C compilers will optimize this function using SIMD (SSE2) instructions when it is clear that the number of requests is a multiple of 4, and it becomes quite competitive.
It is portable, can be defined as a macro or inline function and does not need data tables.
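As a sketch of the "called four times for a 32-bit integer" usage, the per-byte function can be wrapped as below. The names `nbits8` and `nbits32` are hypothetical, and the explicit `uint32_t` intermediates and unsigned constants are an assumption of this sketch to keep the wrapping multiplications well-defined.

```c
#include <stdint.h>

/* Per-byte count using two 32-bit multiplications, as described above. */
static unsigned nbits8(uint8_t v)
{
    uint32_t pairs  = (uint32_t)v - (((uint32_t)v >> 1) & 0x55u);  /* four 2-bit counts */
    uint32_t spread = (pairs * 0x1010101u) & 0x30c00c03u;          /* one pair from each byte copy */
    return (uint32_t)(spread * 0x10040041u) >> 28;                 /* sum lands in the top nibble */
}

/* Four per-byte calls cover a 32-bit value. */
static unsigned nbits32(uint32_t x)
{
    return nbits8((uint8_t)x)
         + nbits8((uint8_t)(x >> 8))
         + nbits8((uint8_t)(x >> 16))
         + nbits8((uint8_t)(x >> 24));
}
```

For the byte 0xFF the pair counts are 0xAA, the masked spread is 0x20800802, and the final shift yields 8.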

This approach can be extended to work on 16 bits at a time, using 64-bit multiplications. However, it fails when all 16 bits are set, returning zero, so it can be used only when the 0xFFFF input value is not present.
It is also slower due to the 64-bit operations and does not optimize well.


Turns out that Hacker's Delight (Anderson) includes a solution with the same features, using only five instructions.

static uint32_t hd8(uint8_t v) { return ((((v * 0x8040201u) >> 3u) & 0x11111111u) * 0x11111111u) >> 28u; } 

Comments

0

Counting set bits in binary representation (N):

Pseudocode -

  1. Set counter = 0.

  2. Repeat while N is not zero:

    1. Check the last bit. If the last bit = 1, increment counter.
    2. Discard the last bit of N.

Now let's code this in C++

    int countSetBits(unsigned int n) {
        int count = 0;
        while (n != 0) {
            count += n & 1;
            n = n >> 1;
        }
        return count;
    }

Let's use this function.

    #include <iostream>

    int main() {
        int x = 5;
        std::cout << countSetBits(x);
        return 0;
    }

Output: 2

Because 5 has 2 bits set in binary representation (101).

You can run the code here.

Comments

-1
    // How about the following:
    public int CountBits(int value)
    {
        int count = 0;
        while (value != 0)
        {
            if ((value & 1) != 0)
                count++;
            value = (int)((uint)value >> 1);  // logical right shift, so negative values terminate too
        }
        return count;
    }

Comments

-1

You can do something like:

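    #include <stdio.h>

    unsigned int countSetBits(unsigned int n)
    {
        n = ((n & 0xAAAAAAAA) >> 1)  + (n & 0x55555555);
        n = ((n & 0xCCCCCCCC) >> 2)  + (n & 0x33333333);
        n = ((n & 0xF0F0F0F0) >> 4)  + (n & 0x0F0F0F0F);
        n = ((n & 0xFF00FF00) >> 8)  + (n & 0x00FF00FF);
        n = ((n & 0xFFFF0000) >> 16) + (n & 0x0000FFFF);  /* fold the two 16-bit halves into the total */
        return n;
    }

    int main()
    {
        int n = 10;
        printf("Number of set bits: %u", countSetBits(n));
        return 0;
    }

See here: http://ideone.com/JhwcX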

The working can be explained as follows:

First, every second bit (mask 0xAAAAAAAA) is shifted right by one and added to its neighbour (mask 0x55555555), which gives the number of set bits in each group of two. The same step is then applied to groups of two, then four, and so on.
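A worked 8-bit version makes the grouping visible. The sketch below records the value after each stage; `swar8_stages` is a hypothetical helper, and the 8-bit width is chosen only to keep the numbers small.

```c
#include <stdint.h>

/* Stores the value after each pairwise-sum stage for a single byte. */
static void swar8_stages(uint8_t v, uint8_t stages[3])
{
    uint8_t n = v;
    n = (uint8_t)(((n & 0xAAu) >> 1) + (n & 0x55u));   /* four 2-bit sums */
    stages[0] = n;
    n = (uint8_t)(((n & 0xCCu) >> 2) + (n & 0x33u));   /* two 4-bit sums */
    stages[1] = n;
    n = (uint8_t)(((n & 0xF0u) >> 4) + (n & 0x0Fu));   /* one 8-bit sum: the bit count */
    stages[2] = n;
}
```

For v = 0xB5 (1011 0101) the stages are 0x65 (pair counts 1, 2, 1, 1), then 0x32 (nibble counts 3 and 2), then 5.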

1 Comment

Buggy: godbolt.org/g/YuN5cA returns 131083 for n=1234667. You need to combine the two 8-bit chunks that the last step leaves, and clear the high bits. Also, it's apparently possible to be more efficient than this SWAR 1-bit, 2-bit, 4-bit sequence. I haven't grokked the magic of the top answer's bit-twiddling hack, though. stackoverflow.com/questions/109023/…
