Skip to main content
added 383 characters in body
Source Link
ShadowRanger
  • 157.7k
  • 12
  • 221
  • 315

First obvious error:

K = [math.floor(math.pow(2, 32) * math.fabs(math.sin(i + 1))) for i in range(63)] 

The pseudo-code is using inclusive bounds notation; range is exclusive on the upper bound, so you just made a K that is 63 elements long, when it's supposed to be 64 elements long.

Same issue with your final loop:

for i in range(63): 

Probably ought to check your other bounds to make sure you haven't made similar errors elsewhere.

This is also dead wrong:

msg = ''.join(format(ord(i), 'b') for i in msg0) 

because it doesn't pad the input bytes the way it should, so your bits are all a jumble. There's a lot of ways to do this (you could probably make it work with a format string of '08b' assuming all your characters have ordinal values below 256, though it's a rather inefficient solution).

Also wrong:

for i in range(64): message.append("64") 

You're supposed to append the original length in bits mod 2**64, not the number 64 repeated 64 times (I have no idea where you came up with that).

For your final question, to explain "break chunk into sixteen 32-bit words M[j], 0 ≤ j ≤ 15", "32 bit words" means "integers comprised of 32 binary bits". For a simple example, breaking 0b100100001111 into three four bit words would produce 0b1001, 0b0000, 0b1111.

Unfortunately for you, in Python, ints are infinite precision; they don't drop overflowing bits when a computation produces a result with more than 32 bits, and MD5 calculations assume they do. So you're going to need to simulate it with a lot of masking, by bitwise and-ing via & 0xffffffff.

In short implementing MD5 in plain Python from the pseudocode is a royal pain in the butt, because the pseudocode relies on the limitations of a low level language like C. And you're clearly not being careful enough implementing even the bits that do translate naturally to Python.

First obvious error:

K = [math.floor(math.pow(2, 32) * math.fabs(math.sin(i + 1))) for i in range(63)] 

The pseudo-code is using inclusive bounds notation; range is exclusive on the upper bound, so you just made a K that is 63 elements long, when it's supposed to be 64 elements long.

Same issue with your final loop:

for i in range(63): 

Probably ought to check your other bounds to make sure you haven't made similar errors elsewhere.

For your final question, to explain "break chunk into sixteen 32-bit words M[j], 0 ≤ j ≤ 15", "32 bit words" means "integers comprised of 32 binary bits". For a simple example, breaking 0b100100001111 into three four bit words would produce 0b1001, 0b0000, 0b1111.

Unfortunately for you, in Python, ints are infinite precision; they don't drop overflowing bits when a computation produces a result with more than 32 bits, and MD5 calculations assume they do. So you're going to need to simulate it with a lot of masking, by bitwise and-ing via & 0xffffffff.

First obvious error:

K = [math.floor(math.pow(2, 32) * math.fabs(math.sin(i + 1))) for i in range(63)] 

The pseudo-code is using inclusive bounds notation; range is exclusive on the upper bound, so you just made a K that is 63 elements long, when it's supposed to be 64 elements long.

Same issue with your final loop:

for i in range(63): 

Probably ought to check your other bounds to make sure you haven't made similar errors elsewhere.

This is also dead wrong:

msg = ''.join(format(ord(i), 'b') for i in msg0) 

because it doesn't pad the input bytes the way it should, so your bits are all a jumble. There's a lot of ways to do this (you could probably make it work with a format string of '08b' assuming all your characters have ordinal values below 256, though it's a rather inefficient solution).

Also wrong:

for i in range(64): message.append("64") 

You're supposed to append the original length in bits mod 2**64, not the number 64 repeated 64 times (I have no idea where you came up with that).

For your final question, to explain "break chunk into sixteen 32-bit words M[j], 0 ≤ j ≤ 15", "32 bit words" means "integers comprised of 32 binary bits". For a simple example, breaking 0b100100001111 into three four bit words would produce 0b1001, 0b0000, 0b1111.

Unfortunately for you, in Python, ints are infinite precision; they don't drop overflowing bits when a computation produces a result with more than 32 bits, and MD5 calculations assume they do. So you're going to need to simulate it with a lot of masking, by bitwise and-ing via & 0xffffffff.

In short implementing MD5 in plain Python from the pseudocode is a royal pain in the butt, because the pseudocode relies on the limitations of a low level language like C. And you're clearly not being careful enough implementing even the bits that do translate naturally to Python.

Source Link
ShadowRanger
  • 157.7k
  • 12
  • 221
  • 315

First obvious error:

K = [math.floor(math.pow(2, 32) * math.fabs(math.sin(i + 1))) for i in range(63)] 

The pseudo-code is using inclusive bounds notation; range is exclusive on the upper bound, so you just made a K that is 63 elements long, when it's supposed to be 64 elements long.

Same issue with your final loop:

for i in range(63): 

Probably ought to check your other bounds to make sure you haven't made similar errors elsewhere.

For your final question, to explain "break chunk into sixteen 32-bit words M[j], 0 ≤ j ≤ 15", "32 bit words" means "integers comprised of 32 binary bits". For a simple example, breaking 0b100100001111 into three four bit words would produce 0b1001, 0b0000, 0b1111.

Unfortunately for you, in Python, ints are infinite precision; they don't drop overflowing bits when a computation produces a result with more than 32 bits, and MD5 calculations assume they do. So you're going to need to simulate it with a lot of masking, by bitwise and-ing via & 0xffffffff.