CRC32 calculation in Python without using libraries

Question

I have been trying to get my head around CRC32 calculations without much success, the values that I seem to get do not match what I should get.

I am aware that Python has libraries that are capable of generating these checksums (namely zlib and binascii) but I do not have the luxury of being able to use them as the CRC functionality do not exist on the micropython.

So far I have the following code:

import binascii import zlib from array import array poly = 0xEDB88320 table = array('L') for byte in range(256): crc = 0 for bit in range(8): if (byte ^ crc) & 1: crc = (crc >> 1) ^ poly else: crc >>= 1 byte >>= 1 table.append(crc) def crc32(string): value = 0xffffffffL for ch in string: value = table[(ord(ch) ^ value) & 0x000000ffL] ^ (value >> 8) return value teststring = "test" print "binascii calc: 0x%08x" % (binascii.crc32(teststring) & 0xffffffff) print "zlib calc: 0x%08x" % (zlib.crc32(teststring) & 0xffffffff) print "my calc: 0x%08x" % (crc32(teststring))

Then I get the following output:

binascii calc: 0xd87f7e0c zlib calc: 0xd87f7e0c my calc: 0x2780810c

The binascii and zlib calculations agree where as my one doesn't. I believe the calculated table of bytes is correct as I have compared it to examples available on the net. So the issue must be the routine where each byte is calculated, could anyone point me in the correct direction?

Thanks in advance!

PM 2Ring · Accepted Answer · 2017-01-10 15:13:44Z

I haven't looked closely at your code, so I can't pinpoint the exact source of the error, but you can easily tweak it to get the desired output:

import binascii from array import array poly = 0xEDB88320 table = array('L') for byte in range(256): crc = 0 for bit in range(8): if (byte ^ crc) & 1: crc = (crc >> 1) ^ poly else: crc >>= 1 byte >>= 1 table.append(crc) def crc32(string): value = 0xffffffffL for ch in string: value = table[(ord(ch) ^ value) & 0xff] ^ (value >> 8) return -1 - value # test data = ( '', 'test', 'hello world', '1234', 'A long string to test CRC32 functions', ) for s in data: print repr(s) a = binascii.crc32(s) print '%08x' % (a & 0xffffffffL) b = crc32(s) print '%08x' % (b & 0xffffffffL) print

output

'' 00000000 00000000 'test' d87f7e0c d87f7e0c 'hello world' 0d4a1185 0d4a1185 '1234' 9be3e0a3 9be3e0a3 'A long string to test CRC32 functions' d2d10e28 d2d10e28

Here are a couple more tests that verify that the tweaked crc32 gives the same result as binascii.crc32.

from random import seed, randrange print 'Single byte tests...', for i in range(256): s = chr(i) a = binascii.crc32(s) & 0xffffffffL b = crc32(s) & 0xffffffffL assert a == b, (repr(s), a, b) print('ok') seed(42) print 'Multi-byte tests...' for width in range(2, 20): print 'Width', width r = range(width) for n in range(1000): s = ''.join([chr(randrange(256)) for i in r]) a = binascii.crc32(s) & 0xffffffffL b = crc32(s) & 0xffffffffL assert a == b, (repr(s), a, b) print('ok')

output

Single byte tests... ok Multi-byte tests... Width 2 Width 3 Width 4 Width 5 Width 6 Width 7 Width 8 Width 9 Width 10 Width 11 Width 12 Width 13 Width 14 Width 15 Width 16 Width 17 Width 18 Width 19 ok

As discussed in the comments, the source of the error in the original code is that this CRC-32 algorithm inverts the initial crc buffer, and then inverts the final buffer contents. So value is initialised to 0xffffffff instead of zero, and we need to return value ^ 0xffffffff, which can also be written as ~value & 0xffffffff, i.e. invert value and then select the low-order 32 bits of the result.

You sir are a Godsend, thank you very much for your fast reply and solution!
@Cooper No worries. I'm not 100% confident of my tweak (due to mixing arithmetic with bitwise operations). It appears to do the job correctly, but I'm a little concerned that it might give the wrong answer in some corner case. OTOH, I just checked that it returns ffffffff when passed '\xff\xff\xff\xff', so that's a good sign. :)
@Cooper After those additional tests, my confidence has increased. :) I'd be very surprised if it returns the wrong result for any input.
it appears that 'return (value ^ 0xffffffff)' will remove the need to do an and operation on the result afterwards. Bitwise arithmetic wasn't my strong point and it has been a while since. Thanks again.
@Cooper Ah, of course! :) Another option is return ~value & 0xffffffff. Both of those are cleaner than return (-1 - value) & 0xffffffff. Your version is probably the best, since it uses the least number of operations.

Rob Milne · Accepted Answer · 2023-01-04 20:16:11Z

If using binary data where the crc is chained over multiple buffers I used the following (using the OPs table):

def crc32(data, crc=0xffffffff): for b in data: crc = table[(b ^ crc) & 0xff] ^ (crc >> 8) return crc

One can XOR the final result with -1 to agree with the online calculators.

crc = crc32(b'test') print('0x{:08x}'.format(crc)) crc = crc32(b'te') crc = crc32(b'st', crc) print('0x{:08x}'.format(crc)) print('xor: 0x{:08x}'.format(crc ^ 0xffffffff))

output

0x278081f3 0x278081f3 xor: 0xd87f7e0c

Collectives™ on Stack Overflow

CRC32 calculation in Python without using libraries

2 Answers 2

6 Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

6 Comments

Comments

Linked

Related