17

How does this function work?

A mask with the least significant n bits set to 1.

Example:

n = 6 --> 0x2F, n = 17 --> 0x1FFFF // I don't get these at all, especially how n = 6 --> 0x2F

Also, what is a mask?

5
  • 4
    also what is a mask? How about Wikipedia? Commented Sep 14, 2012 at 0:34
  • 3
    The 0x2F is wrong by the way, it should be 0x3f Commented Sep 14, 2012 at 0:36
  • 1
    @chris wiki is too confusing... Commented Sep 14, 2012 at 0:39
  • Possible duplicate of Algorithm to generate bit mask Commented Jun 9, 2019 at 1:28
  • Bit masking: What is bit masking? Commented Aug 14, 2023 at 3:01

7 Answers

33

One common way is to take a 1, and shift it left n bits. That will give you something like: 00100000. Then subtract one from that, which will clear the bit that's set, and set all the less significant bits, so in this case we'd get: 00011111.
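The shift-then-subtract idea can be sketched as below. The helper name `low_mask` is made up for illustration, and the sketch assumes a 32-bit mask with n strictly less than 32, since shifting by the full width is undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask with the n least significant bits set.
   Assumes n < 32, because shifting a 32-bit value by 32 is undefined. */
uint32_t low_mask(unsigned n)
{
    assert(n < 32);
    uint32_t one_hot = (uint32_t)1 << n;  /* n = 5 gives 00100000 */
    return one_hot - 1;                   /* n = 5 gives 00011111 */
}
```

With this sketch, `low_mask(6)` is 0x3F and `low_mask(17)` is 0x1FFFF.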

Although I haven't seen it used as often, it's marginally safer (from some perspectives, anyway) to use an unsigned type for the mask. Initialize it to -1 (which is guaranteed to set all the bits to 1), then shift it right to clear the top bits you don't want set (which is why you want it unsigned--this guarantees that the top bits will be zeros). So for example, to get an 8-bit mask with the 3 least significant bits set, you'd do something like:

    uint8_t mask = -1;
    mask >>= 8 - 3;

This avoids undefined behavior when you end up needing a mask that happens to be precisely the width of the type you're using (e.g., an 8-bit type where you need all 8 bits set).
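The same right-shift idea for a 32-bit mask might look like the sketch below. The name `low_mask_shr` is hypothetical, and the sketch assumes n is between 1 and 32: n == 0 would need a shift by the full 32 bits, which is itself undefined for a 32-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: n least significant bits set, built by shifting
   an all-ones value right. Assumes 1 <= n <= 32. */
uint32_t low_mask_shr(unsigned n)
{
    assert(n >= 1 && n <= 32);
    uint32_t all_ones = (uint32_t)-1;  /* conversion to unsigned sets every bit */
    return all_ones >> (32 - n);       /* zeros are shifted in from the top */
}
```

So this variant trades the problem case: it is fine at the full width but needs a guard for an empty mask.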

A mask is normally used with bitwise operations, especially AND. You'd use the mask above to get the 3 least significant bits by themselves, isolated from anything else that might be present. This is especially common when dealing with hardware, where a single register often contains bits representing a number of entirely separate, unrelated quantities and/or flags.
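As an illustration of the hardware case, here is a sketch that pulls separate fields out of one byte. The register layout (mode in bits 0-2, an enable flag in bit 3, a channel number in bits 4-7) and all names are invented for this example and do not describe any real device.

```c
#include <stdint.h>

/* Hypothetical 8-bit status register layout (assumed for illustration):
   bits 0-2: mode, bit 3: enable flag, bits 4-7: channel. */
#define MODE_MASK      0x07u
#define ENABLE_MASK    0x08u
#define CHANNEL_SHIFT  4
#define CHANNEL_MASK   0x0Fu

/* Each accessor isolates one field with AND; the channel is shifted down first. */
unsigned reg_mode(uint8_t reg)    { return reg & MODE_MASK; }
unsigned reg_enabled(uint8_t reg) { return (reg & ENABLE_MASK) != 0; }
unsigned reg_channel(uint8_t reg) { return (reg >> CHANNEL_SHIFT) & CHANNEL_MASK; }
```

For a register value of 0xAD (1010 1101 in binary), the mode is 5, the enable flag is set, and the channel is 10.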


3 Comments

Keep in mind that going 1 << w - 1, where w is the width of the data type, to set all but one of the bits, is UB.
Exactly. Blame it on Intel, but it made it to the standard.
I think you should do something like (x << n) - 1. Arithmetic operations have higher precedence than bit-shift operations. So, x << n - 1 will give you the wrong result.
11

A mask is a common term for an integer value that is bit-wise ANDed, ORed, XORed, etc with another integer value.

For example, if you want to extract the 8 least significant bits of an int variable, you do variable & 0xFF. 0xFF is a mask.

Likewise if you want to set bits 0 and 8, you do variable | 0x101, where 0x101 is a mask.

Or if you want to invert the same bits, you do variable ^ 0x101, where 0x101 is a mask.
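The three operations above can be tried out with a small sketch like this one. The function names are invented for illustration.

```c
#include <stdint.h>

/* AND keeps only the bits that are 1 in the mask (here bits 0-7). */
uint32_t low_byte(uint32_t v)          { return v & 0xFFu; }

/* OR forces the masked bits (0 and 8) to 1 and leaves the rest alone. */
uint32_t set_bits_0_and_8(uint32_t v)  { return v | 0x101u; }

/* XOR inverts the masked bits (0 and 8) and leaves the rest alone. */
uint32_t flip_bits_0_and_8(uint32_t v) { return v ^ 0x101u; }
```

For v = 0x1234, `low_byte` gives 0x34, and both `set_bits_0_and_8` and `flip_bits_0_and_8` give 0x1335, because bits 0 and 8 were clear to begin with. Flipping 0x1335 again returns 0x1234.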

To generate a mask for your case you should exploit the simple mathematical fact that if you add 1 to your mask (the mask having all its least significant bits set to 1 and the rest to 0), you get a value that is a power of 2.

So, if you generate the right power of 2 (2^n for an n-bit mask), you can subtract 1 from it to get the mask.

Positive powers of 2 are easily generated with the left shift << operator in C.

Hence, 1 << n yields 2^n. In binary it's a 1 followed by n 0s (10...0).

(1 << n) - 1 will produce a mask with n lowest bits set to 1.

Now, you need to watch out for overflows in left shifts. In C (and in C++) you can't legally shift a variable left by as many bit positions as the variable has, so if ints are 32-bit, 1<<32 results in undefined behavior. Signed integer overflows should also be avoided, so you should use unsigned values, e.g. 1u << 31.
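One possible way to sidestep the shift-by-width problem for a 32-bit mask is to do the shift in a wider unsigned type and truncate afterwards. This is a sketch, not the only option; the name `low_mask_wide` is hypothetical, and it assumes n is at most 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: n low bits set, valid for every n from 0 to 32.
   The shift is done on a 64-bit value, so a shift count of 32 is in range. */
uint32_t low_mask_wide(unsigned n)
{
    assert(n <= 32);
    return (uint32_t)(((uint64_t)1 << n) - 1);
}
```

The trick does not extend to 64-bit masks on its own, since there is no wider standard type to borrow; there an explicit check for the full-width case is needed.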


11

For both correctness and performance, the best way to accomplish this has changed since this question was asked back in 2012 due to the advent of BMI instructions in modern x86 processors, specifically BLSMSK.

Here's a good way of approaching this problem, while retaining backwards compatibility with older processors.

This method is correct, whereas the current top answers produce undefined behavior in edge cases.

Clang and GCC, when allowed to optimize using BMI instructions, will condense gen_mask() to just two ops. With supporting hardware, be sure to add compiler flags for BMI instructions: -mbmi -mbmi2

    #include <inttypes.h>
    #include <stdio.h>

    uint64_t gen_mask(const uint_fast8_t msb)
    {
        const uint64_t src = (uint64_t)1 << msb;
        return (src - 1) ^ src;
    }

    int main()
    {
        uint_fast8_t msb;

        for (msb = 0; msb < 64; ++msb) {
            printf("%016" PRIx64 "\n", gen_mask(msb));
        }
        return 0;
    }

5 Comments

Sorry, that's a misunderstanding: I would have used the width as parameter (like the N that the OP mentioned), but since you use the index of the MSB it's actually consistent.
What does the consting do in this case?
I think what @UlrichEckhardt meant was that if msb defines the inclusive most-significant bit (which is the typical usage), then your mask is too short by 1 bit. By the inclusive definition, an msb of 1 should select bits 1 and 0, thus a mask of 0x3, but your code produces 0x1. src = 1LL << (msb + 1) is better. Or -- change your variable name from "msb" to "num_of_bits" and then you are correct.
Because the input has been shifted by one this solution works for all bits set (msb == 63), but you can no longer ask for a mask with no bits set, since msb == 0 gives you the bottom bit set.
What are some examples of edge cases?
2

First, for those who only want the code to create the mask:

    uint64_t bits = 6;
    uint64_t mask = ((uint64_t)1 << bits) - 1;
    // Results in 0b111111 (or 0x03F)

Thanks to @Benni who asked about using bits = 64. If you need the code to support this value as well, you can use:

    uint64_t bits = 6;
    uint64_t mask = (bits < 64) ? ((uint64_t)1 << bits) - 1 : (uint64_t)0 - 1;

For those who want to know what a mask is:

A mask is usually a name for a value that we use to manipulate other values using bitwise operations such as AND, OR, XOR, etc.

Short masks are usually represented in binary, where we can explicitly see all the bits that are set to 1.

Longer masks are usually represented in hexadecimal, which is easy to read once you get the hang of it.

You can read more about bitwise operations in C here.

2 Comments

does this work if bits is 64?
No @Benni, it didn't. Added a condition to handle that. Getting a single implementation for both is tricky.
1

On x86-64 processors supporting BMI2 (almost every Intel and AMD CPU made after 2013) this can be done with a single instruction: BZHI. The catch is that in order to get your compiler to emit said instruction you may have to use compiler intrinsics.

For reference, I have included the assembly generated by GCC 15. On GCC you should supply the -mbmi2 flag to enable BMI2 instructions.

    #include <stdint.h>
    #include <immintrin.h>

    uint64_t lsb_bzhi(uint64_t num_bits)
    {
        return _bzhi_u64((uint64_t) -1, num_bits);
    }

    "lsb_bzhi":
            mov     rax, -1
            bzhi    rax, rax, rdi
            ret

Not only is this as fast as it gets, it also handles the edge cases of num_bits == 0 and num_bits >= 64 correctly. One caveat: BZHI reads only the low 8 bits of the index, so this holds for num_bits up to 255.

The traditional solution involving bit shifting is good for portability. However, the compilers I tested (GCC 15 and Clang 20) did not identify the BZHI pattern, resulting in comparatively slower code, especially when you need to handle the num_bits >= 64 edge case.

    #include <stdint.h>

    uint64_t lsb_shlx(uint64_t num_bits)
    {
        if (num_bits >= 64) {
            return (uint64_t) -1;
        }
        return ((uint64_t) 1 << num_bits) - 1;
    }

    "lsb_shlx":
            mov     eax, 1
            mov     rdx, -1
            shlx    rax, rax, rdi
            sub     rax, 1
            cmp     rdi, 64
            cmovnb  rax, rdx
            ret

Note that even if you aren't targeting only modern x86-64 you still could use the preprocessor to conditionally call BZHI when available. On GCC this would look something like:

    uint64_t lsb(uint64_t num_bits)
    {
    #ifdef __BMI2__
        return lsb_bzhi(num_bits);
    #else
        return lsb_shlx(num_bits);
    #endif
    }

Someone else has mentioned that the BMI1 instruction BLSMSK can also be used to achieve something similar. However, this requires some additional code to handle the num_bits == 0 edge case and to prepare the instruction input, losing most of the performance benefits of BZHI.

    #include <stdint.h>
    #include <immintrin.h>

    uint64_t lsb_blsmsk(uint64_t num_bits)
    {
        if (!num_bits) {
            return 0;
        }
        return _blsmsk_u64((uint64_t) 1 << (num_bits - 1));
    }

    "lsb_blsmsk":
            xor     eax, eax
            test    rdi, rdi
            je      .L3
            sub     edi, 1
            mov     eax, 1
            shlx    rax, rax, rdi
            blsmsk  rax, rax
    .L3:
            ret

4 Comments

Assertions about which solution is fastest or which instructions a compiler will use are fragile. A compiler may turn a compound solution like return n == 64 ? -1 : ((uint64_t) 1 << n) - 1; into a single instruction if it recognizes the opportunity. If today’s version of the compiler does not do so, tomorrow’s might. Conversely, an intrinsic technically only specifies the function to be performed, not the instruction to use, so a compiler may turn a single intrinsic into multiple instructions (I have seen this)…
You are of course right. I have revised the answer to clarify that this applies to modern GCC. I have also added a link to the compiler output in Godbolt. Playing around with different compilers and different ways of expressing the creation of a least-significant bits mask in C, nothing I tried resulted in BZHI instructions. As of today, short of writing actual x86 assembly, it seems that using compiler intrinsics is our best bet.
… (Intrinsics may have been created to facilitate access to instructions, but they nonetheless do not always provide the instructions they are named for.)
0

I believe your first example should be 0x3f.

0x3f is hexadecimal notation for the number 63, which is 111111 in binary, so the last 6 bits (the least significant 6 bits) are set to 1.

The following little C program will calculate the correct mask:

    #include <stdarg.h>
    #include <stdio.h>

    int mask_for_n_bits(int n)
    {
        int mask = 0;

        for (int i = 0; i < n; ++i)
            mask |= 1 << i;
        return mask;
    }

    int main (int argc, char const *argv[])
    {
        printf("6: 0x%x\n17: 0x%x\n", mask_for_n_bits(6), mask_for_n_bits(17));
        return 0;
    }

1 Comment

The return type for mask_for_n_bits should be unsigned, as should the type of mask, and the update expression should use mask |= 1U << i;
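An unsigned version along the lines this comment suggests might look like the sketch below. The name `mask_for_n_bits_u` and the clamp on the loop bound are additions for illustration, not part of the answer; the clamp keeps the shift count below the width of unsigned.

```c
#include <limits.h>

/* Hypothetical unsigned variant of the loop-based mask builder.
   Sets bits 0 .. n-1, stopping at the width of unsigned so that
   1U << i never shifts by the full width or more. */
unsigned mask_for_n_bits_u(unsigned n)
{
    unsigned mask = 0;

    for (unsigned i = 0; i < n && i < sizeof(unsigned) * CHAR_BIT; ++i)
        mask |= 1U << i;
    return mask;
}
```

With this version, asking for as many bits as unsigned has returns UINT_MAX rather than overflowing a signed int.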
0

0x2F is 0010 1111 in binary - this should be 0x3f, which is 0011 1111 in binary and which has the 6 least-significant bits set.

Similarly, 0x1FFFF is 0001 1111 1111 1111 1111 in binary, which has the 17 least-significant bits set.
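These binary expansions can be checked mechanically. The sketch below uses two helpers with invented names: one counts the 1 bits, and the other tests whether a value is a solid run of low 1 bits, using the fact that adding 1 to such a value carries through every set bit.

```c
#include <stdint.h>

/* Returns 1 if v has the form 0...01...1 (only a run of low bits set).
   Adding 1 to such a value clears every set bit, so v & (v + 1) is 0. */
int is_low_mask(uint32_t v)
{
    return (v & (v + 1)) == 0;
}

/* Counts the 1 bits in v by examining the lowest bit and shifting right. */
unsigned count_ones(uint32_t v)
{
    unsigned count = 0;

    while (v) {
        count += v & 1u;
        v >>= 1;
    }
    return count;
}
```

Here 0x3F and 0x1FFFF pass `is_low_mask` with 6 and 17 bits set respectively, while 0x2F fails it and has only 5 bits set.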

A "mask" is a value that is intended to be combined with another value using a bitwise operator like &, | or ^ to individually set, unset, flip or leave unchanged the bits in that other value.

For example, if you combine the mask 0x3F with some value n using the & operator, the result will have zeroes in all but the 6 least significant bits, and those 6 bits will be copied unchanged from the value n.

In the case of an & mask, a binary 0 in the mask means "unconditionally set the result bit to 0" and a 1 means "set the result bit to the input value bit". For an | mask, a 0 in the mask sets the result bit to the input bit and a 1 unconditionally sets the result bit to 1, and for an ^ mask, a 0 sets the result bit to the input bit and a 1 sets the result bit to the complement of the input bit.
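The per-bit rules can be seen with a 6-bit mask in a sketch like the following. The names are hypothetical, and the last helper shows the usual way to unset bits: AND with the complement of the mask.

```c
#include <stdint.h>

#define LOW6 0x3Fu  /* the 6 least significant bits */

uint32_t keep_low6(uint32_t v)  { return v & LOW6;  }  /* other bits become 0 */
uint32_t set_low6(uint32_t v)   { return v | LOW6;  }  /* low 6 bits become 1 */
uint32_t flip_low6(uint32_t v)  { return v ^ LOW6;  }  /* low 6 bits inverted */
uint32_t clear_low6(uint32_t v) { return v & ~LOW6; }  /* low 6 bits become 0 */
```

For v = 0xA5 (1010 0101), these give 0x25, 0xBF, 0x9A and 0x80 respectively; in every case the bits outside the affected group are either untouched or zeroed exactly as the rules above describe.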

1 Comment

Oops. Got the wrong edit after your update, but I did roll back. Sorry!
