
This has most likely been asked and answered before, but my searches were futile.

The question is about bits, bytes, masks and checking.

Say one has two "triggers", 0xC4 and 0xC5:

    196: 1100 0100 0xc4
    197: 1100 0101 0xc5

The simple way of checking if var is either would be:

if (var == 0xc5 || var == 0xc4) { } 

But sometimes one sees this (or the like):

if ( ((var ^ magic) & mask) == 0) { } 

My question is how to find magic and mask. What methods, procedures, tricks etc. can be used to derive these values, and to determine whether any exist at all?


EDIT:

To clarify: yes, in this exact example the former would be better than the latter, but my question is more general: how to generate and check these kinds of masks, and bit twiddling in general. I omitted a lot and tried to keep the question simple. But ...

As an example, I had a look at the OllyDbg decompiler source, where one finds:

if (((code ^ pd->code) & pd->mask) == 0) FOUND 

Here code holds 0 to 3 bytes of the command, copied from the instruction:

    unsigned long code = 0;
    if (size > 0) *(((char *)&code) + 0) = cmd[0];
    if (size > 1) *(((char *)&code) + 1) = cmd[1];
    if (size > 2) *(((char *)&code) + 2) = cmd[2];

That is, the masking is done only against the bytes that are actually part of cmd.
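The pointer cast in the quoted snippet only puts cmd[0] in the low byte on a little-endian host. A minimal sketch of a byte-order-independent alternative follows, assuming cmd[0] is meant to land in the lowest byte (which is what the low-byte masks in the table suggest). The name pack_code is hypothetical and not taken from the OllyDbg source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the OllyDbg source): pack up to the
   first three bytes of cmd into an integer, cmd[0] in the lowest byte.
   Shifts produce the same value on little- and big-endian hosts. */
uint32_t pack_code(const unsigned char *cmd, size_t size)
{
    uint32_t code = 0;
    if (size > 0) code |= (uint32_t)cmd[0];
    if (size > 1) code |= (uint32_t)cmd[1] << 8;
    if (size > 2) code |= (uint32_t)cmd[2] << 16;
    return code;
}
```

With the bytes 8A 45 08 this yields 0x08458A, so the opcode byte is the one that the 0x0000FF-style masks look at.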

And pd is part of:

    struct t_cmddata {
        uint32_t mask;   // Mask for first 4 bytes of the command
        uint32_t code;   // Compare masked bytes with this
        ...
    }

stored in a long array such as:

    const t_cmddata cmddata[] = {
      /* mask      code */
      { 0x0000FF, 0x000090, 1,00, NNN,NNN,NNN, C_CMD+0, "NOP" },
      { 0x0000FE, 0x00008A, 1,WW, REG,MRG,NNN, C_CMD+0, "MOV" },
      { 0x0000F8, 0x000050, 1,00, RCM,NNN,NNN, C_PSH+0, "PUSH" },
      { 0x0000FE, 0x000088, 1,WW, MRG,REG,NNN, C_CMD+0, "MOV" },
      { 0x0000FF, 0x0000E8, 1,00, JOW,NNN,NNN, C_CAL+0, "CALL" },
      { 0x0000FD, 0x000068, 1,SS, IMM,NNN,NNN, C_PSH+0, "PUSH" },
      { 0x0000FF, 0x00008D, 1,00, REG,MMA,NNN, C_CMD+0, "LEA" },
      { 0x0000FF, 0x000074, 1,CC, JOB,NNN,NNN, C_JMC+0, "JE,JZ" },
      { 0x0000F8, 0x000058, 1,00, RCM,NNN,NNN, C_POP+0, "POP" },
      { 0x0038FC, 0x000080, 1,WS, MRG,IMM,NNN, C_CMD+1, "ADD" },
      { 0x0000FF, 0x000075, 1,CC, JOB,NNN,NNN, C_JMC+0, "JNZ,JNE" },
      { 0x0000FF, 0x0000EB, 1,00, JOB,NNN,NNN, C_JMP+0, "JMP" },
      { 0x0000FF, 0x0000E9, 1,00, JOW,NNN,NNN, C_JMP+0, "JMP" },
      { 0x0000FE, 0x000084, 1,WW, MRG,REG,NNN, C_CMD+0, "TEST" },
      { 0x0038FE, 0x0000C6, 1,WW, MRG,IMM,NNN, C_CMD+1, "MOV" },
      { 0x0000FE, 0x000032, 1,WW, REG,MRG,NNN, C_CMD+0, "XOR" },
      ...

That would be a typical live example of usage. So again: I am looking for methods for this. I have been looking at Karnaugh maps etc., but thought there might be other methods for the same area of operation.
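To make the lookup concrete, here is a rough sketch of how such a table can be scanned. The struct cmd_pattern and the function find_command are hypothetical, cut down to the fields needed for matching; only the mask/code/name values are copied from the table above. It is not the OllyDbg implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical version of t_cmddata: only the fields needed
   for matching, plus the mnemonic. */
struct cmd_pattern {
    uint32_t mask;
    uint32_t code;
    const char *name;
};

/* A few mask/code pairs copied from the table above. */
static const struct cmd_pattern patterns[] = {
    { 0x0000FF, 0x000090, "NOP"  },
    { 0x0000FE, 0x00008A, "MOV"  },
    { 0x0000F8, 0x000050, "PUSH" },
    { 0x0038FC, 0x000080, "ADD"  },
};

/* Return the name of the first entry whose masked bits equal those of
   code, or NULL when no entry matches. */
const char *find_command(uint32_t code)
{
    size_t n = sizeof patterns / sizeof patterns[0];
    for (size_t i = 0; i < n; i++) {
        if (((code ^ patterns[i].code) & patterns[i].mask) == 0)
            return patterns[i].name;
    }
    return NULL;
}
```

The masks show which bits are "don't care": 0xFE ignores the lowest opcode bit (the WW size flag), 0xF8 ignores the three register bits, and 0x0038FC additionally requires three bits of the second byte to be zero.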

  • "None" would be my quick answer, considering the huge difference in readability between the two! :) Commented Jul 5, 2013 at 10:51
  • Which bits are you interested in? Just check for those bits, no need to use any xor operation. Commented Jul 5, 2013 at 10:53
  • I doubt it was XOR that you saw. I rather think it was & instead, which is bitwise AND. Commented Jul 5, 2013 at 10:55
  • I tried to write an answer. But my brain isn't working right right now. :( Commented Jul 5, 2013 at 11:08
  • @luserdroog: Thanks for the effort, unfortunately I didn't see it ... and for brain not working, I'm right there with you. I'm at a standstill. Whole week. In a period where logic simply does not comply. Commented Jul 5, 2013 at 11:12

2 Answers


I assume your question is: given a set of "triggers", can we find a mask and a magic such that the triggers can be checked by the following code

if ( ((var ^ magic) & mask) == 0) { } 

which is the same as

if ((var & mask) == (magic & mask)) { } 
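The two forms agree because AND distributes over XOR: (var ^ magic) & mask equals (var & mask) ^ (magic & mask), and an XOR is zero exactly when its operands are equal. As a quick sanity check, a small sketch that compares both forms for every 8-bit value; the function name forms_agree is made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare the XOR form and the AND form for every 8-bit var.
   Returns true when they give the same verdict for all 256 values. */
bool forms_agree(uint8_t magic, uint8_t mask)
{
    for (unsigned v = 0; v <= 0xFF; v++) {
        bool xor_form = ((v ^ magic) & mask) == 0;
        bool and_form = (v & mask) == (unsigned)(magic & mask);
        if (xor_form != and_form)
            return false;
    }
    return true;
}
```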

An example set of "triggers" could be

    196: 1100 0100 0xc4
    197: 1100 0101 0xc5
    204: 1100 1100 0xcc
    205: 1100 1101 0xcd

If it is feasible, the bits of the triggers can be classified into 2 types: "specific bits" and "arbitrary bits". Specific bits are the same in every trigger, like the first 4 bits and the 6th and 7th bits here (counting from the left). Arbitrary bits are the rest, here the 5th and 8th: if you change an arbitrary bit of a trigger, the result is still a trigger.

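So there must be exactly 2^N triggers, where N denotes the number of arbitrary bits. The mask has a 1 at every specific bit and a 0 at every arbitrary bit, and magic is any trigger (ANDed with the mask).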
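The classification above can be turned into a short procedure. This is only a sketch under stated assumptions: 8-bit values, no duplicates in the input, and a hypothetical function name derive_mask. It collects every bit position where any trigger differs from the first one, counts those bits to get N, and accepts the set only if it has exactly 2^N members.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: derive mask and magic for n distinct 8-bit
   triggers. Returns false when no single (magic, mask) pair matches
   exactly this set. Assumes the triggers contain no duplicates. */
bool derive_mask(const uint8_t *t, size_t n, uint8_t *mask, uint8_t *magic)
{
    if (n == 0)
        return false;

    uint8_t differ = 0;              /* bits that vary across the set */
    for (size_t i = 1; i < n; i++)
        differ |= (uint8_t)(t[0] ^ t[i]);

    unsigned arbitrary = 0;          /* N = number of arbitrary bits */
    for (uint8_t d = differ; d != 0; d >>= 1)
        arbitrary += d & 1u;

    /* The pair accepts 2^N values; the set must contain all of them. */
    if (n != ((size_t)1 << arbitrary))
        return false;

    *mask  = (uint8_t)~differ;
    *magic = (uint8_t)(t[0] & *mask);
    return true;
}
```

For 0xC4, 0xC5, 0xCC, 0xCD the varying bits are 0x09, so the mask is 0xF6 and magic is 0xC4. Dropping 0xCD leaves three triggers with two arbitrary bits, which is rejected because the mask would also accept 0xCD.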

This is my first answer on stackoverflow. I'm not sure if I understand your question correctly. Or are you asking about other bit twiddling methods?


1 Comment

Strictly speaking it is equivalent to (var & mask) == (magic & mask), but of course one would choose magic so that (magic & mask) == magic.

Given your two values,

196: 1100 0100 0xc4 197: 1100 0101 0xc5 

you'd want to mask-off the bits that differ, in this case bit 0. So the mask value would be the inverse of 0x01, 0xFE.

i.e. (0xC4 & 0xFE) == 0xC4, and (0xC5 & 0xFE) == 0xC4.

That means both values become 0xC4. Then you can check for 0xC4 by xor-ing with the exact bit pattern that should remain.

 1100 0100 0xC4 

i.e. (0xC4 ^ 0xC4) == 0.

      1100 0100    1100 0101
    & 1111 1110    1111 1110
      ---- ----    ---- ----
      1100 0100    1100 0100
    ^ 1100 0100
      ---- ----
      0000 0000

Mask first, or risk utter confusion.
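For 8-bit values, a candidate pair can be validated by brute force instead of a hand-written truth table: run the test on all 256 inputs and compare against the wanted set. A minimal sketch, with the hypothetical name accepts_exactly:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when the test accepts exactly the values in want[0..n-1]
   and rejects every other 8-bit value. */
bool accepts_exactly(uint8_t magic, uint8_t mask,
                     const uint8_t *want, size_t n)
{
    for (unsigned v = 0; v <= 0xFF; v++) {
        bool accepted = ((v ^ magic) & mask) == 0;

        bool wanted = false;
        for (size_t i = 0; i < n; i++)
            if (want[i] == v)
                wanted = true;

        if (accepted != wanted)
            return false;
    }
    return true;
}
```

With the set 0xC4, 0xC5 this holds for magic 0xC4 and mask 0xFE. It fails for mask 0xFC, which also lets 0xC6 and 0xC7 through, and for mask 0xFF, which rejects 0xC5.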


Looking through the actual source file, I kinda think he is trying to be obfuscated. Many of the functions want factoring.

2 Comments

Great. This is good. Do you know of any method for validating these kinds of expressions (besides truth tables, which can quickly become a bit vast), or for coding it in a procedure, rather than the "OK, let's try this one" approach? And any topics I should read up on to see these kinds of patterns more quickly (methods, books, math topics, etc.)?
Not yet, but I'm inspired to try to find some. I'm taking a look at the decompiler code. The part you quoted that sets code is totally non-portable (assumes little-endian) and that's a little distracting. :)
