8 bits representing the number 7 look like this:
```
00000111
```

Three bits are set.
What are the algorithms to determine the number of set bits in a 32-bit integer?
Here is a portable module (ANSI C) which can benchmark each of your algorithms on any architecture.

Your CPU has 9-bit bytes? No problem :-) At the moment it implements two algorithms, the K&R algorithm and a byte-wise lookup table. The lookup table is on average 3 times faster than the K&R algorithm. If someone can figure out a way to make the "Hacker's Delight" algorithm portable, feel free to add it in.
```c
#ifndef _BITCOUNT_H_
#define _BITCOUNT_H_

/* Return the Hamming Weight of val, i.e. the number of 'on' bits. */
int bitcount( unsigned int );

/* List of available bitcount algorithms.
 * onTheFly:    Calculate the bitcount on demand.
 *
 * lookupTable: Uses a small lookup table to determine the bitcount.  This
 * method is on average 3 times as fast as onTheFly, but incurs a small
 * upfront cost to initialize the lookup table on the first call.
 *
 * strategyCount is just a placeholder.
 */
enum strategy { onTheFly, lookupTable, strategyCount };

/* String representations of the algorithm names */
extern const char *strategyNames[];

/* Choose which bitcount algorithm to use. */
void setStrategy( enum strategy );

#endif
```
```c
#include <limits.h>

#include "bitcount.h"

/* The number of entries needed in the table is equal to the number of
 * unique values a char can represent, which is always UCHAR_MAX + 1 */
static unsigned char _bitCountTable[UCHAR_MAX + 1];
static unsigned int _lookupTableInitialized = 0;

static int _defaultBitCount( unsigned int val ) {
    int count;

    /* Starting with:
     * 1100 - 1 == 1011,  1100 & 1011 == 1000
     * 1000 - 1 == 0111,  1000 & 0111 == 0000
     */
    for ( count = 0; val; ++count )
        val &= val - 1;

    return count;
}

/* Looks up each byte of the integer in a lookup table.
 *
 * The first time the function is called it initializes the lookup table.
 */
static int _tableBitCount( unsigned int val ) {
    int bCount = 0;

    if ( !_lookupTableInitialized ) {
        unsigned int i;
        for ( i = 0; i != UCHAR_MAX + 1; ++i )
            _bitCountTable[i] = ( unsigned char )_defaultBitCount( i );

        _lookupTableInitialized = 1;
    }

    for ( ; val; val >>= CHAR_BIT )
        bCount += _bitCountTable[val & UCHAR_MAX];

    return bCount;
}

static int ( *_bitcount ) ( unsigned int ) = _defaultBitCount;

const char *strategyNames[] = { "onTheFly", "lookupTable" };

void setStrategy( enum strategy s ) {
    switch ( s ) {
    case onTheFly:
        _bitcount = _defaultBitCount;
        break;
    case lookupTable:
        _bitcount = _tableBitCount;
        break;
    case strategyCount:
        break;
    }
}

/* Just a forwarding function which will call whichever version of the
 * algorithm has been selected by the client
 */
int bitcount( unsigned int val ) {
    return _bitcount( val );
}

#ifdef _BITCOUNT_EXE_

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Use the same sequence of pseudo random numbers to benchmark each
 * Hamming Weight algorithm.
 */
void benchmark( int reps ) {
    clock_t start, stop;
    int i, j;
    static const int iterations = 1000000;

    for ( j = 0; j != strategyCount; ++j ) {
        setStrategy( j );

        srand( 257 );

        start = clock( );
        for ( i = 0; i != reps * iterations; ++i )
            bitcount( rand( ) );
        stop = clock( );

        printf( "\n\t%d pseudo-random integers using %s: %f seconds\n\n",
                reps * iterations, strategyNames[j],
                ( double )( stop - start ) / CLOCKS_PER_SEC );
    }
}

int main( void ) {
    int option;

    while ( 1 ) {
        printf( "Menu Options\n"
                "\t1.\tPrint the Hamming Weight of an Integer\n"
                "\t2.\tBenchmark Hamming Weight implementations\n"
                "\t3.\tExit ( or cntl-d )\n\n\t" );

        if ( scanf( "%d", &option ) == EOF )
            break;

        switch ( option ) {
        case 1:
            printf( "Please enter the integer: " );
            if ( scanf( "%d", &option ) != EOF )
                printf( "The Hamming Weight of %d ( 0x%X ) is %d\n\n",
                        option, option, bitcount( option ) );
            break;
        case 2:
            printf( "Please select number of reps ( in millions ): " );
            if ( scanf( "%d", &option ) != EOF )
                benchmark( option );
            break;
        case 3:
            goto EXIT;
            break;
        default:
            printf( "Invalid option\n" );
        }
    }

 EXIT:
    printf( "\n" );
    return 0;
}

#endif
```

32-bit or not? I just came up with this method in Java after reading "Cracking the Coding Interview", 4th edition, exercise 5.5 (chapter 5: Bit Manipulation). If the least significant bit is 1, increment count, then right-shift the integer.
```java
public static int bitCount( int n ) {
    int count = 0;
    for ( int i = n; i != 0; i = i >> 1 ) {
        count += i & 1;
    }
    return count;
}
```

I think this one is more intuitive than the solutions with the constant 0x33333333, no matter how fast they are. It depends on your definition of "best algorithm".
Note: in bitCount(), the for loop never terminates when n < 0, because >> is an arithmetic shift in Java and keeps the sign bit set.

Naive Solution
Time Complexity is O(no. of bits in n)
```c
int countSet(unsigned int n)
{
    int res = 0;
    while (n != 0) {
        res += (n & 1);
        n >>= 1;   // logical right shift, like C unsigned or Java >>>
    }
    return res;
}
```

Brian Kernighan's algorithm
Time Complexity is O(no of set bits in n)
```c
int countSet(unsigned int n)
{
    int res = 0;
    while (n != 0) {
        n = (n & (n - 1));
        res++;
    }
    return res;
}
```

Lookup table method for a 32-bit number. In this method we break the 32-bit number into four 8-bit chunks.
Time Complexity is O(1)
```c
static unsigned char table[256]; /* the table size is 256, the number of
                                    values i & 0xFF (8 bits) can have */

void initialize() // holds the number of set bits from 0 to 255
{
    table[0] = 0;
    for (unsigned int i = 1; i < 256; i++)
        table[i] = (i & 1) + table[i >> 1];
}

int countSet(unsigned int n)
{
    // 0xff is the hexadecimal representation of 8 set bits.
    int res = table[n & 0xff];
    n = n >> 8;
    res = res + table[n & 0xff];
    n = n >> 8;
    res = res + table[n & 0xff];
    n = n >> 8;
    res = res + table[n & 0xff];
    return res;
}
```

```python
def hammingWeight(n: int) -> int:
    sums = 0
    while n != 0:
        sums += 1
        n = n & (n - 1)
    return sums
```

In the binary representation, the least significant 1-bit in n always corresponds to a 0-bit in n - 1. Therefore, ANDing the two numbers n and n - 1 always flips the least significant 1-bit in n to 0 and keeps all other bits the same.
Personally I use this:
```java
public static int myBitCount(long L) {
    int count = 0;
    while (L != 0) {
        count++;
        L ^= L & -L;
    }
    return count;
}
```

```c
int countBits(int x)
{
    int n = 0;
    if (x) do n++;
           while(x=x&(x-1));
    return n;
}
```

Or also:
```c
int countBits(int x) { return (x)? 1+countBits(x&(x-1)): 0; }
```

7½ years after my original answer, @PeterMortensen questioned if this was even valid C syntax. I posted a link to an online compiler showing that it is in fact perfectly valid syntax (code below).
```c
#include <stdio.h>

int countBits(int x)
{
    int n = 0;
    if (x) do n++;           /* Totally Normal Valid code. */
           while(x=x&(x-1)); /* Nothing to see here. */
    return n;
}

int main(void)
{
    printf("%d\n", countBits(25));
    return 0;
}
```

If you want to re-write it for clarity, it would look like:
```c
if (x) {
    do {
        n++;
    } while(x=x&(x-1));
}
```

But that seems excessive to my eye.
However, I've also realized the function can be made shorter, but perhaps more cryptic, written as:
```c
int countBits(int x)
{
    int n = 0;
    while (x) x=(n++,x&(x-1));
    return n;
}
```

Here is a solution that has not been mentioned so far, using bitfields. The following program counts the set bits in an array of 100000000 16-bit integers using 4 different methods. Timing results are given in parentheses (on Mac OS X, with gcc -O3):
```c
#include <stdio.h>
#include <stdlib.h>

#define LENGTH 100000000

typedef struct {
    unsigned char bit0 : 1;
    unsigned char bit1 : 1;
    unsigned char bit2 : 1;
    unsigned char bit3 : 1;
    unsigned char bit4 : 1;
    unsigned char bit5 : 1;
    unsigned char bit6 : 1;
    unsigned char bit7 : 1;
} bits;

unsigned char sum_bits(const unsigned char x)
{
    const bits *b = (const bits*) &x;
    return b->bit0 + b->bit1 + b->bit2 + b->bit3 \
         + b->bit4 + b->bit5 + b->bit6 + b->bit7;
}

int NumberOfSetBits(int i)
{
    i = i - ((i >> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
    return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}

#define out(s) \
    printf("bits set: %lu\nbits counted: %lu\n", 8*LENGTH*sizeof(short)*3/4, s);

int main(int argc, char **argv)
{
    unsigned long i, s;
    unsigned short *x = malloc(LENGTH*sizeof(short));
    unsigned char lut[65536], *p;
    unsigned short *ps;
    int *pi;

    /* set 3/4 of the bits */
    for (i=0; i<LENGTH; ++i)
        x[i] = 0xFFF0;

    /* sum_bits (1.772s) */
    for (i=LENGTH*sizeof(short), p=(unsigned char*) x, s=0; i--; s+=sum_bits(*p++));
    out(s);

    /* NumberOfSetBits (0.404s) */
    for (i=LENGTH*sizeof(short)/sizeof(int), pi=(int*)x, s=0; i--; s+=NumberOfSetBits(*pi++));
    out(s);

    /* populate lookup table */
    for (i=0, p=(unsigned char*) &i; i<sizeof(lut); ++i)
        lut[i] = sum_bits(p[0]) + sum_bits(p[1]);

    /* 256-bytes lookup table (0.317s) */
    for (i=LENGTH*sizeof(short), p=(unsigned char*) x, s=0; i--; s+=lut[*p++]);
    out(s);

    /* 65536-bytes lookup table (0.250s) */
    for (i=LENGTH, ps=x, s=0; i--; s+=lut[*ps++]);
    out(s);

    free(x);
    return 0;
}
```

While the bitfield version is very readable, the timing results show that it is over 4x slower than NumberOfSetBits(). The lookup-table based implementations are still quite a bit faster, in particular with a 65 kB table.
```c
int bitcount(unsigned int n)
{
    int count = 0;
    while (n) {
        count += n & 0x1u;
        n >>= 1;
    }
    return count;
}
```

This iterated count runs in time proportional to the total number of bits. It simply loops through all the bits, terminating slightly early because of the while condition. Useful if the set bits are sparse and among the least significant bits.
In Java 8 or 9, just invoke Integer.bitCount.
You can use the built-in function named __builtin_popcount(). There is no __builtin_popcount in standard C++, but it is a built-in function of the GCC compiler. This function returns the number of set bits in an integer.

```c
int __builtin_popcount (unsigned int x);
```

Reference: Bit Twiddling Hacks
Another Hamming weight algorithm if you're on a BMI2-capable CPU:
```c
the_weight = __tzcnt_u64(~_pext_u64(data[i], data[i]));
```

pext same,same to pack the bits could be an interesting building-block for something else, but tzcnt and pext both run on the same port as popcnt on Intel CPUs, and pext is very slow on AMD (agner.org/optimize). You can sort of emulate pext x,x with (1ULL << popcnt(x)) - 1, except for the x == 0 case. x86 shifts can't shift out all the bits, because they mask the shift count, and you have to watch out for C undefined behaviour with out-of-range counts.

I am providing one more unmentioned algorithm, called Parallel, taken from here. The nice point about it is that it is generic, meaning that the code is the same for bit sizes 8, 16, 32, 64, and 128.
I checked the correctness of its values and timings on 2^26 numbers for bit sizes 8, 16, 32, and 64. See the timings below.

This algorithm is the first code snippet below. The other two are mentioned here just for reference, because I tested and compared against them.

The algorithms are coded in C++ to be generic, but they can be easily adapted to old C.
```cpp
#include <type_traits>
#include <cstdint>

template <typename IntT>
inline size_t PopCntParallel(IntT n) {
    // https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
    using T = std::make_unsigned_t<IntT>;

    T v = T(n);
    v = v - ((v >> 1) & (T)~(T)0/3);                               // temp
    v = (v & (T)~(T)0/15*3) + ((v >> 2) & (T)~(T)0/15*3);          // temp
    v = (v + (v >> 4)) & (T)~(T)0/255*15;                          // temp
    return size_t((T)(v * ((T)~(T)0/255)) >> (sizeof(T) - 1) * 8); // count
}
```

Below are the two algorithms that I compared with. One is the Kernighan simple method with a loop, taken from here.
```cpp
template <typename IntT>
inline size_t PopCntKernighan(IntT n) {
    // http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
    using T = std::make_unsigned_t<IntT>;

    T v = T(n);
    size_t c;
    for (c = 0; v; ++c)
        v &= v - 1; // Clear the least significant bit set
    return c;
}
```

The other uses the built-in __popcnt16()/__popcnt()/__popcnt64() MSVC intrinsics (doc here), or __builtin_popcount of Clang/GCC (doc here). These intrinsics should provide a very optimized version, possibly hardware:
```cpp
#ifdef _MSC_VER
    // https://learn.microsoft.com/en-us/cpp/intrinsics/popcnt16-popcnt-popcnt64?view=msvc-160
    #include <intrin.h>
    #define popcnt16 __popcnt16
    #define popcnt32 __popcnt
    #define popcnt64 __popcnt64
#else
    // https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
    #define popcnt16 __builtin_popcount
    #define popcnt32 __builtin_popcount
    #define popcnt64 __builtin_popcountll
#endif

template <typename IntT>
inline size_t PopCntBuiltin(IntT n) {
    using T = std::make_unsigned_t<IntT>;
    T v = T(n);
    if constexpr(sizeof(IntT) <= 2)
        return popcnt16(uint16_t(v));
    else if constexpr(sizeof(IntT) <= 4)
        return popcnt32(uint32_t(v));
    else if constexpr(sizeof(IntT) <= 8)
        return popcnt64(uint64_t(v));
    else
        static_assert([]{ return false; }());
}
```

Below are the timings, in nanoseconds per number. All timings were done for 2^26 random numbers. Timings are compared for all three algorithms and all bit sizes among 8, 16, 32, and 64. In sum, all tests took 16 seconds on my machine. The high-resolution clock was used.
```
08 bit  Builtin     8.2 ns
08 bit  Parallel    8.2 ns
08 bit  Kernighan  26.7 ns
16 bit  Builtin     7.7 ns
16 bit  Parallel    7.7 ns
16 bit  Kernighan  39.7 ns
32 bit  Builtin     7.0 ns
32 bit  Parallel    7.0 ns
32 bit  Kernighan  47.9 ns
64 bit  Builtin     7.5 ns
64 bit  Parallel    7.5 ns
64 bit  Kernighan  59.4 ns
128 bit Builtin     7.8 ns
128 bit Parallel   13.8 ns
128 bit Kernighan 127.6 ns
```

As one can see, the provided Parallel algorithm (the first of the three) is as good as MSVC's/Clang's intrinsic.
For reference, below is full code that I used to test speed/time/correctness of all functions.
As a bonus, this code (unlike the short code snippets above) also tests the 128-bit size, but only under Clang/GCC (not MSVC), as they have unsigned __int128.
```cpp
#include <type_traits>
#include <cstdint>

using std::size_t;

#if defined(_MSC_VER) && !defined(__clang__)
    #define IS_MSVC 1
#else
    #define IS_MSVC 0
#endif

#if IS_MSVC
    #define HAS128 false
#else
    using int128_t = __int128;
    using uint128_t = unsigned __int128;
    #define HAS128 true
#endif

template <typename T> struct UnSignedT { using type = std::make_unsigned_t<T>; };
#if HAS128
    template <> struct UnSignedT<int128_t> { using type = uint128_t; };
    template <> struct UnSignedT<uint128_t> { using type = uint128_t; };
#endif
template <typename T> using UnSigned = typename UnSignedT<T>::type;

template <typename IntT>
inline size_t PopCntParallel(IntT n) {
    // https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
    using T = UnSigned<IntT>;

    T v = T(n);
    v = v - ((v >> 1) & (T)~(T)0/3);                               // temp
    v = (v & (T)~(T)0/15*3) + ((v >> 2) & (T)~(T)0/15*3);          // temp
    v = (v + (v >> 4)) & (T)~(T)0/255*15;                          // temp
    return size_t((T)(v * ((T)~(T)0/255)) >> (sizeof(T) - 1) * 8); // count
}

template <typename IntT>
inline size_t PopCntKernighan(IntT n) {
    // http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
    using T = UnSigned<IntT>;

    T v = T(n);
    size_t c;
    for (c = 0; v; ++c)
        v &= v - 1; // Clear the least significant bit set
    return c;
}

#if IS_MSVC
    // https://learn.microsoft.com/en-us/cpp/intrinsics/popcnt16-popcnt-popcnt64?view=msvc-160
    #include <intrin.h>
    #define popcnt16 __popcnt16
    #define popcnt32 __popcnt
    #define popcnt64 __popcnt64
#else
    // https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
    #define popcnt16 __builtin_popcount
    #define popcnt32 __builtin_popcount
    #define popcnt64 __builtin_popcountll
#endif

#define popcnt128(x) (popcnt64(uint64_t(x)) + popcnt64(uint64_t(x >> 64)))

template <typename IntT>
inline size_t PopCntBuiltin(IntT n) {
    using T = UnSigned<IntT>;
    T v = T(n);
    if constexpr(sizeof(IntT) <= 2)
        return popcnt16(uint16_t(v));
    else if constexpr(sizeof(IntT) <= 4)
        return popcnt32(uint32_t(v));
    else if constexpr(sizeof(IntT) <= 8)
        return popcnt64(uint64_t(v));
    else if constexpr(sizeof(IntT) <= 16)
        return popcnt128(uint128_t(v));
    else
        static_assert([]{ return false; }());
}

#include <random>
#include <vector>
#include <chrono>
#include <string>
#include <iostream>
#include <iomanip>
#include <map>

inline double Time() {
    static auto const gtb = std::chrono::high_resolution_clock::now();
    return std::chrono::duration_cast<std::chrono::duration<double>>(
        std::chrono::high_resolution_clock::now() - gtb).count();
}

template <typename T, typename F>
void Test(std::string const & name, F f) {
    std::mt19937_64 rng{123};
    size_t constexpr bit_size = sizeof(T) * 8, ntests = 1 << 6, nnums = 1 << 14;
    std::vector<T> nums(nnums);
    for (size_t i = 0; i < nnums; ++i)
        nums[i] = T(rng() % ~T(0));
    static std::map<size_t, size_t> times;
    double min_time = 1000;
    for (size_t i = 0; i < ntests; ++i) {
        double timer = Time();
        size_t sum = 0;
        for (size_t j = 0; j < nnums; j += 4)
            sum += f(nums[j + 0]) + f(nums[j + 1]) + f(nums[j + 2]) + f(nums[j + 3]);
        auto volatile vsum = sum;
        min_time = std::min(min_time, (Time() - timer) / nnums);
        if (times.count(bit_size) && times.at(bit_size) != sum)
            std::cout << "Wrong bit cnt checksum!" << std::endl;
        times[bit_size] = sum;
    }
    std::cout << std::setw(2) << std::setfill('0') << bit_size << " bit "
              << name << " " << std::fixed << std::setprecision(1)
              << min_time * 1000000000 << " ns" << std::endl;
}

int main() {
    #define TEST(T) \
        Test<T>("Builtin", PopCntBuiltin<T>); \
        Test<T>("Parallel", PopCntParallel<T>); \
        Test<T>("Kernighan", PopCntKernighan<T>); \
        std::cout << std::endl;

    TEST(uint8_t); TEST(uint16_t); TEST(uint32_t); TEST(uint64_t);
    #if HAS128
        TEST(uint128_t);
    #endif
    #undef TEST
}
```

I've done it through a Perl script. The number taken is $c = 0x11BBBBAB; B = 3 ones, A = 2 ones, so in total 1+1+3+3+3+2+3+3 = 19.

```perl
#!/usr/local/bin/perl
$c = 0x11BBBBAB;
$count = 0;
$m = 0x00000001;
for ($i = 0; $i < 32; $i++) {
    $f = $c & $m;
    if ($f == 1) {
        $count++;
    }
    $c = $c >> 1;
}
printf("%d", $count);
```

Here is the sample code, which might be useful.
```java
private static final int[] bitCountArr = new int[]{
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8};

private static final int firstByteFF = 255;

public static final int getCountOfSetBits(int value) {
    int count = 0;
    for (int i = 0; i < 4; i++) {
        if (value == 0) break;
        count += bitCountArr[value & firstByteFF];
        value >>>= 8;
    }
    return count;
}
```

For Java, there is java.util.BitSet. https://docs.oracle.com/javase/8/docs/api/java/util/BitSet.html
cardinality(): Returns the number of bits set to true in this BitSet.
The BitSet is memory efficient, since the bits are stored packed in longs.
For those who want it in C++11 for any unsigned integer type as a constexpr function (tacklelib/include/tacklelib/utility/math.hpp):
```cpp
#include <stdint.h>
#include <limits>
#include <type_traits>

const constexpr uint32_t uint32_max = (std::numeric_limits<uint32_t>::max)();

namespace detail
{
    template <typename T>
    inline constexpr T _count_bits_0(const T & v)
    {
        return v - ((v >> 1) & 0x55555555);
    }

    template <typename T>
    inline constexpr T _count_bits_1(const T & v)
    {
        return (v & 0x33333333) + ((v >> 2) & 0x33333333);
    }

    template <typename T>
    inline constexpr T _count_bits_2(const T & v)
    {
        return (v + (v >> 4)) & 0x0F0F0F0F;
    }

    template <typename T>
    inline constexpr T _count_bits_3(const T & v)
    {
        return v + (v >> 8);
    }

    template <typename T>
    inline constexpr T _count_bits_4(const T & v)
    {
        return v + (v >> 16);
    }

    template <typename T>
    inline constexpr T _count_bits_5(const T & v)
    {
        return v & 0x0000003F;
    }

    template <typename T, bool greater_than_uint32>
    struct _impl
    {
        static inline constexpr T _count_bits_with_shift(const T & v)
        {
            return
                detail::_count_bits_5(
                    detail::_count_bits_4(
                        detail::_count_bits_3(
                            detail::_count_bits_2(
                                detail::_count_bits_1(
                                    detail::_count_bits_0(v)))))) +
                count_bits(v >> 32);
        }
    };

    template <typename T>
    struct _impl<T, false>
    {
        static inline constexpr T _count_bits_with_shift(const T & v)
        {
            return 0;
        }
    };
}

template <typename T>
inline constexpr T count_bits(const T & v)
{
    static_assert(std::is_integral<T>::value, "type T must be an integer");
    static_assert(!std::is_signed<T>::value, "type T must be not signed");

    return uint32_max >= v ?
        detail::_count_bits_5(
            detail::_count_bits_4(
                detail::_count_bits_3(
                    detail::_count_bits_2(
                        detail::_count_bits_1(
                            detail::_count_bits_0(v)))))) :
        detail::_impl<T, sizeof(uint32_t) < sizeof(v)>::_count_bits_with_shift(v);
}
```

Plus tests in the Google Test library:
```cpp
#include <stdlib.h>
#include <time.h>

namespace {
    template <typename T>
    inline uint32_t _test_count_bits(const T & v)
    {
        uint32_t count = 0;
        T n = v;
        while (n > 0) {
            if (n % 2) {
                count += 1;
            }
            n /= 2;
        }
        return count;
    }
}

TEST(FunctionsTest, random_count_bits_uint32_100K)
{
    srand(uint_t(time(NULL)));

    for (uint32_t i = 0; i < 100000; i++) {
        const uint32_t r = uint32_t(rand()) + (uint32_t(rand()) << 16);
        ASSERT_EQ(_test_count_bits(r), count_bits(r));
    }
}

TEST(FunctionsTest, random_count_bits_uint64_100K)
{
    srand(uint_t(time(NULL)));

    for (uint32_t i = 0; i < 100000; i++) {
        const uint64_t r = uint64_t(rand()) + (uint64_t(rand()) << 16) +
                           (uint64_t(rand()) << 32) + (uint64_t(rand()) << 48);
        ASSERT_EQ(_test_count_bits(r), count_bits(r));
    }
}
```

Convert the integer to a binary string and count the ones.
PHP solution:
```php
substr_count(decbin($integer), '1');
```

A simple algorithm to count the number of set bits:
```c
int countbits(unsigned int n)
{
    int count = 0;
    while (n != 0) {
        n = n & (n - 1);
        count++;
    }
    return count;
}
```

Take the example of 11 (1011) and try manually running through the algorithm. It should help you a lot!
Here is the functional master race recursive solution, and it is by far the purest one (and can be used with any bit length!):
```cpp
template<typename T>
int popcnt(T n)
{
    if (n > 0)
        return (n & 1) + popcnt(n >> 1);
    return 0;
}
```

Note the parentheses around n & 1: without them, + binds tighter than &, so the original n&1 + popcnt(n>>1) would compute n & (1 + popcnt(n >> 1)), which is wrong.

Kotlin pre 1.4
```kotlin
fun NumberOfSetBits(i: Int): Int {
    var i = i
    i -= (i ushr 1 and 0x55555555)
    i = (i and 0x33333333) + (i ushr 2 and 0x33333333)
    return (i + (i ushr 4) and 0x0F0F0F0F) * 0x01010101 ushr 24
}
```

This is more or less a copy of the answer seen in the top answer.
It is with the Java fixes and is then converted using the converter in the IntelliJ IDEA Community Edition
1.4 and beyond (as of 2021-05-05 - it could change in the future).
```kotlin
fun NumberOfSetBits(i: Int): Int {
    return i.countOneBits()
}
```

Under the hood it uses Integer.bitCount, as seen here:

```kotlin
@SinceKotlin("1.4")
@WasExperimental(ExperimentalStdlibApi::class)
@kotlin.internal.InlineOnly
public actual inline fun Int.countOneBits(): Int = Integer.bitCount(this)
```

I'll contribute to @Arty's answer:
__popcnt16()/__popcnt()/__popcnt64() MSVC intrinsics (doc here)

The popcnt instruction, as noted in the "Remarks" section, is available as part of the SSE4 instruction set, and there is a relatively high chance of it not being available:
If you run code that uses these intrinsics on hardware that doesn't support the popcnt instruction, the results are unpredictable.
So, you need to implement a check as per "Remarks" section:
To determine hardware support for the popcnt instruction, call the __cpuid intrinsic with InfoType=0x00000001 and check bit 23 of CPUInfo[2] (ECX). This bit is 1 if the instruction is supported, and 0 otherwise.
Here's how you do it:
```cpp
unsigned popcnt(const unsigned input)
{
    struct cpuinfo_t
    {
        union
        {
            int regs[4];
            struct
            {
                long eax, ebx, ecx, edx;
            };
        };
        cpuinfo_t() noexcept : regs() {}
    } cpuinfo;

    // EAX=1: Processor Info and Feature Bits
    __cpuid(cpuinfo.regs, 1);

    // ECX bit 23: popcnt
    if (_bittest(&cpuinfo.ecx, 23))
    {
        return __popcnt(input);
    }

    // Choose any fallback implementation you like, there's already a ton of them
    unsigned num = input;
    num = (num & 0x55555555) + (num >> 1 & 0x55555555);
    num = (num & 0x33333333) + (num >> 2 & 0x33333333);
    num = (num & 0x0F0F0F0F) + (num >> 4 & 0x0F0F0F0F);
    num = (num & 0x00FF00FF) + (num >> 8 & 0x00FF00FF);
    num = (num & 0x0000FFFF) + (num >> 16 & 0x0000FFFF);
    return num;
}
```

In C#, what about a one-liner:
```csharp
BitOperations.PopCount(Mask);
```

Returns the population count (number of bits set) of a mask. Similar in behavior to the x86 instruction POPCNT. Compatible with x64! It uses an intrinsic (a built-in instruction of the x86 architecture) to count the number of bits very fast in a 32-bit or 64-bit value.
NOTE: BitOperations.PopCount() is not CLS-compliant. Take this into consideration.
Cheers
I have not seen this approach anywhere:
```c
int nbits(unsigned char v) {
    return ((((v - ((v >> 1) & 0x55)) * 0x1010101) & 0x30c00c03) * 0x10040041) >> 28;
}
```
It works per byte, so it would have to be called four times for a 32-bit integer. It is derived from the sideways addition, but it uses two 32-bit multiplications to reduce the number of instructions to only seven.
Most current C compilers will optimize this function using SIMD (SSE2) instructions when it is clear that the number of requests is a multiple of 4, and it becomes quite competitive.
It is portable, can be defined as a macro or inline function and does not need data tables.
This approach can be extended to work on 16 bits at a time, using 64-bit multiplications. However, it fails when all 16 bits are set, returning zero, so it can be used only when the 0xFFFF input value is not present.
It is also slower due to the 64-bit operations and does not optimize well.
Turns out that Hacker's Delight (Anderson) includes a solution with the same features, using only five instructions.
```c
static uint32_t hd8(uint8_t v) {
    return ((((v * 0x8040201u) >> 3u) & 0x11111111u) * 0x11111111u) >> 28u;
}
```

Set counter = 0.
repeat counting while N is not zero.
```cpp
#include <iostream>
using namespace std;

int countSetBits(unsigned int n) {
    int count = 0;
    while (n != 0) {
        count += n & 1;
        n = n >> 1;
    }
    return count;
}
```

Let's use this function:

```cpp
int main() {
    int x = 5;
    cout << countSetBits(x);
    return 0;
}
```

Output: 2
Because 5 has 2 bits set in binary representation (101).
You can do something like:
```c
#include <stdio.h>

int countSetBits(int n)
{
    n = ((n & 0xAAAAAAAA) >> 1) + (n & 0x55555555);
    n = ((n & 0xCCCCCCCC) >> 2) + (n & 0x33333333);
    n = ((n & 0xF0F0F0F0) >> 4) + (n & 0x0F0F0F0F);
    n = ((n & 0xFF00FF00) >> 8) + (n & 0x00FF00FF);
    return n;
}

int main()
{
    int n = 10;
    printf("Number of set bits: %d", countSetBits(n));
    return 0;
}
```

See here: http://ideone.com/JhwcX
The working can be explained as follows:
First, all the even bits are shifted right and added to the odd bits to count the number of bits in groups of two. Then we work in groups of two, then four, and so on.
Try n = 1234667: this returns the wrong result, because you still need to combine the two chunks that the last step leaves and clear the high bits. Also, it's apparently possible to be more efficient than this SWAR 1-bit, 2-bit, 4-bit sequence. I haven't grokked the magic of the top answer's bit-twiddling hack, though. stackoverflow.com/questions/109023/…