8

I had an interview today where they asked me to write two "C" functions, one to extract a single bit and the other to extract a range of bits from a character. It took me a while, but I came up with these functions.

int extractBit(char byte, int pos)
{
    assert( (pos >= 0) && (pos < 8) );
    return ( ( byte & (1 << pos) ) >> pos );
}

char extractBitRange(char byte, int startingPos, int offset)
{
    assert( ( (startingPos + offset) >= 0) && ( (startingPos + offset) < 8) );
    return ( byte >> startingPos ) & ~(0xff << (offset + 1));
}

But the interviewer kept asking me if I could speed up the code further (in terms of CPU cycles) and whether there was any scope for optimization. I was clearly out of my depth, and I am curious: how would you do this?

5
  • Using C++ TMP would give an incredible run-time speedup. :) Commented Oct 4, 2009 at 12:52
  • I don't think templates would add anything. A compiler should be able to optimize the hell out of these functions if they are called with constants... Commented Oct 4, 2009 at 13:17
  • To avoid problems with shifting and logical operations on signed values, I'd make all parameters to the functions unsigned. As a plus, if they're unsigned you don't need to check for >= 0. Commented Oct 4, 2009 at 13:26
  • @rajachan: note that the interviewer was probably more interested in how you reacted to the unknown and being challenged and pushed for more information than he was in the answer to the specific question. How do you react under pressure? How well do you think on your feet? That's what you need to work on - much more than the answer to this, or any other, bit-twiddling question. Commented Oct 4, 2009 at 20:03
  • @sth: "TMP" stands for template metaprogramming. This computes the value at compile-time and simply puts the result into the code. I'm not aware of many other optimization techniques that yield exactly zero run-time. :) Commented Oct 4, 2009 at 22:57

6 Answers

18

In extractBit, if you shift first, you can mask with 1 instead of (1<<pos). Considering that pos is an argument of the function, that saves a computation.

return (byte >> pos) & 1;

In the second function, I would assert that startingPos and offset are each non-negative instead of asserting that their sum is; it makes more sense that way.
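A minimal sketch of what the two functions could look like with those changes applied (shift first and mask with 1, and assert the pieces of the range check separately); it keeps the parameter names and the original mask from the question:

#include <assert.h>

int extractBit(char byte, int pos)
{
    assert( (pos >= 0) && (pos < 8) );
    return (byte >> pos) & 1;                        /* shift first, then mask with 1 */
}

char extractBitRange(char byte, int startingPos, int offset)
{
    assert( (startingPos >= 0) && (offset >= 0) );   /* check each part separately */
    assert( (startingPos + offset) < 8 );
    return (byte >> startingPos) & ~(0xff << (offset + 1));
}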


2 Comments

I'd just make them unsigned ints.
Yup. For extractBit my first thought was a lookup table, but this will probably be more efficient on modern CPUs. You could also declare the function as inline in C++ or C99, which will save the function call overhead. As Pete says, you could make the args unsigned ints. Or you could cast the test values that way in the asserts and eliminate the tests against zero - e.g. assert( ((unsigned int)(startingPos + offset)) < 8 ); - negative values will turn into very big positive values, and it really just turns into a slightly different machine-language comparison opcode.
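To illustrate that comment's suggestions (inline, unsigned arguments, and the cast trick in the assert), here is a rough sketch; this is just one reading of the comment, not code from it:

#include <assert.h>

static inline int extractBit(unsigned char byte, unsigned int pos)
{
    assert(pos < 8);                     /* unsigned, so no >= 0 test is needed */
    return (byte >> pos) & 1;
}

static inline unsigned char extractBitRange(unsigned char byte, int startingPos, int offset)
{
    /* a negative sum wraps to a huge unsigned value and fails this single test */
    assert( ((unsigned int)(startingPos + offset)) < 8 );
    return (byte >> startingPos) & ~(0xffu << (offset + 1));
}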
5

A lookup table?

1 Comment

For the single bit extraction, you would need a table of 256 entries for each of 8 possible bits - which is a 2 KB table if stored in characters (256 bytes if you compress everything and use bit operations to get the values out - but then you're back to where you started). For the ranges, you couldn't sensibly define tables for all 36 possible ranges of bits. Of course, if you have some other structure than a lookup table indexed by byte value and bit position (or bit range), then there may be alternatives, but you'd have to explain that.
3

Another change you can make in the range-of-bits function:

 ~(0xff << (offset + 1)) --> ~(0xfe << offset) 

Since << 1 is nothing more than *2, you can apply that operation to the constant instead (which, as you are operating on single bytes, just amounts to clearing the LSB).
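A quick way to convince yourself the two masks are equivalent (a throwaway verification snippet, not part of the answer itself):

#include <assert.h>
#include <stdio.h>

int main(void)
{
    /* compare the two mask expressions for every legal offset, low byte only */
    for (int offset = 0; offset < 8; offset++) {
        unsigned a = ~(0xffu << (offset + 1)) & 0xffu;
        unsigned b = ~(0xfeu << offset) & 0xffu;
        assert(a == b);
        printf("offset %d: mask 0x%02x\n", offset, a);
    }
    return 0;
}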

1 Comment

Neat trick. Another way would be to inject (8-offset) zeros at the left with something like (unsigned int)0xff >> (8 - offset); this still leaves an arithmetic operation on offset, but it saves the one's-complement operation.
3

You can speed up the first function by first shifting to the right and then masking the bit:

int extractBit(char byte, int pos)
{
    return (byte >> pos) & 0x01;
}

This saves you one operation.

For the second question, I assume that startingPos is the first bit of the chunk you want to extract and offset is how many bits in the chunk you need. Then you could use this:

char extractBitRange(char byte, int startingPos, int offset)
{
    return (byte >> startingPos) & ((1 << offset) - 1);
}

Of course you must be careful about the ranges, just as you did in your code.
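For example, assuming offset is the number of bits to extract (hypothetical values, just to show the calling convention):

/* 0x6C is 0110 1100 in binary */
char chunk = extractBitRange(0x6C, 2, 3);   /* bits 2..4 -> 011 -> 3 */
int  bit   = extractBit(0x6C, 3);           /* bit 3 -> 1 */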

EDIT: if you want extractBitRange(b,i,0) to behave like extractBit(b,i) and extract a single bit at position i, this variant does that:

return (byte >> startingPos) & ((2 << offset) - 1); 

2 Comments

Should this be "return (byte >> startingPos) & (1U << (offset + 1))" since offset starts from zero? extractBitRange(3,0) is equivalent to extractBit(3), while extractBitRange(3,1) would fetch bits (3,4)?
I assumed offset is how many bits you need. If you want the equivalence extractBitRange(x,i,0) == extractBit(x,i), then modify it to return (byte >> startingPos) & ((1 << (offset+1)) - 1). Note that (1<<howmany)-1 is a convenient way of getting howmany consecutive one bits, e.g. (1<<3)-1 = 2**3-1 = 8-1 = 7, three consecutive ones. This is useful for masking.
0
#include <stdio.h>

int extractBit(int byte, int pos)
{
    if ( !((pos >= 0) && (pos < 16)) ) {
        return 0;
    }
    return ( ( byte & (1 << pos) ) >> pos );
}

int _tmain()
{
    // TODO: Please replace the sample code below with your own.
    int value;
    signed int res, bit;

    value = 0x1155;
    printf("%x\n", value);
    //Console::WriteLine("Hello World");
    //fun1();
    for (bit = 15; bit >= 0; bit--) {
        res = extractBit(value, bit);
        printf("%d", res);
    }
    return 0;
}


-3

If you want to get really quick, you can use a lookup table. I'm guessing that that's what the interviewer was going for (as a final answer to "how can I make this faster").

Basically, that means you create in advance a huge table, mapping every possible combination of parameters to the proper result. For example, you'd have:

byte = 0x0, pos = 0, result = 0
byte = 0x0, pos = 1, result = 0
...
byte = 0x1, pos = 0, result = 1

Obviously this would need to be put into valid C data structures (arrays, indexed by the byte and pos). This would allow your function to simply return an element of an array, based on whatever indexing scheme you choose.

For the first function, this wouldn't take up too much memory. We're talking about a byte's worth of values (a byte can have 256 different values) times 8 possible values for starting pos, which makes an array of 2048.

For the second function, this would actually take a lot more space. You'd need to multiply 256 times all the possible values for both starting and ending pos (keeping in mind that there are illegal combinations of starting and ending pos).

I'm guessing the interviewer just wanted you to answer that this would be a way to speed it up, and then supply the above thinking into trying to estimate how much space it would cost versus time saved.
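For what it's worth, a rough sketch of the table for the first function might look like this (the table and init function names are mine, and you pay for the 2048-entry table plus a one-time initialization cost):

#include <stdio.h>

static unsigned char bitTable[256][8];      /* 256 byte values x 8 bit positions */

static void initBitTable(void)
{
    for (int byte = 0; byte < 256; byte++)
        for (int pos = 0; pos < 8; pos++)
            bitTable[byte][pos] = (unsigned char)((byte >> pos) & 1);
}

int main(void)
{
    initBitTable();
    /* extraction is now just an array read */
    printf("%d\n", bitTable[0x55][2]);      /* 0x55 = 0101 0101, bit 2 is 1 */
    return 0;
}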

