Trying to find a checksum (possible CRC?) algorithm for CAN messages on a Porsche

Question

I have the following problem. CAN itself already has a CRC at the protocol level, but on top of that I have data sets where the first byte of the CAN message is some checksum generated in an unknown way and the second byte is some kind of counter. I have some examples and I am trying to find out how to calculate that first byte if I modify any data in the payload (bytes 3 through 8 of the CAN message).

How to read the data:

timestamp canid length byte1 byte2 byte3 byte4... byte8

I already verified that both the CAN ID and the length (total number of bytes) have an influence on the checksum. The time does not, so you can ignore the leading number (50.533 for example).

DATA SET 1:

    50.533 492 3 39 00 00
    50.633 492 3 37 01 00
    50.732 492 3 25 02 00
    50.831 492 3 2B 03 00
    50.931 492 3 01 04 00
    51.030 492 3 0F 05 00
    51.130 492 3 1D 06 00
    51.229 492 3 13 07 00
    51.329 492 3 49 08 00
    51.428 492 3 47 09 00
    51.527 492 3 55 0A 00
    51.627 492 3 5B 0B 00
    51.726 492 3 71 0C 00
    51.826 492 3 7F 0D 00
    51.925 492 3 6D 0E 00
    52.025 492 3 63 0F 00

DATA SET 2:

    11.270 3C0 4 66 00 00 00
    11.369 3C0 4 D3 01 00 00
    11.469 3C0 4 23 02 00 00
    11.568 3C0 4 96 03 00 00
    11.668 3C0 4 EC 04 00 00
    11.767 3C0 4 59 05 00 00
    11.867 3C0 4 A9 06 00 00
    11.966 3C0 4 1C 07 00 00
    12.066 3C0 4 5D 08 00 00
    12.166 3C0 4 E8 09 00 00
    12.266 3C0 4 18 0A 00 00
    12.365 3C0 4 AD 0B 00 00
    12.465 3C0 4 D7 0C 00 00
    12.564 3C0 4 62 0D 00 00
    12.664 3C0 4 92 0E 00 00
    12.763 3C0 4 27 0F 00 00

DATA SET 3:

    25.462 3C0 4 9B 00 23 00
    25.561 3C0 4 2E 01 23 00
    25.661 3C0 4 DE 02 23 00
    25.760 3C0 4 6B 03 23 00
    25.860 3C0 4 11 04 23 00
    25.960 3C0 4 A4 05 23 00
    26.059 3C0 4 54 06 23 00
    26.159 3C0 4 E1 07 23 00
    26.258 3C0 4 A0 08 23 00
    26.358 3C0 4 15 09 23 00
    26.458 3C0 4 E5 0A 23 00
    26.557 3C0 4 50 0B 23 00
    26.657 3C0 4 2A 0C 23 00
    26.756 3C0 4 9F 0D 23 00
    26.856 3C0 4 6F 0E 23 00
    26.955 3C0 4 DA 0F 23 00

As you can see, data sets 2 and 3 only differ by that 0x23 byte in data spot 3 and the checksum is completely different for all 16 counter positions.

Anyone got a hint? I already tried reveng but couldn't come up with anything :(

Please post the entirety of the messages, as many as you can. — pythonpython
– pythonpython, Commented Apr 25, 2021 at 3:08
That's all I have. After that they repeat. Like the counter restarts after 0F goes back to 00 and the checksums repeat. So I only posted unique messages, not any repeats. — Stefan
– Stefan, Commented Apr 26, 2021 at 8:09
Your messages are 5 or 6 bytes, what are the missing bytes for each? The 5 byte messages, for example? Zeros? — pythonpython
– pythonpython, Commented Apr 26, 2021 at 15:12
No there aren't any other bytes. In CAN BUS there can be a 4 byte message like this: 3C0 4 9B 00 23 00 it means time ID 3C0, 4 bytes length and then there are the 4 bytes. I did not omit any data :) — Stefan
– Stefan, Commented Apr 28, 2021 at 2:37

Edward · Accepted Answer · 2022-12-03 23:04:27Z

The answer

See the code at the end for how to calculate this, but if you're interested, here are the details explaining how I got there. Throughout this text, numbers are mostly in hex, so keep that in mind.

The process

For the first data set, I noticed that each set bit in the counter seemed to correspond to a particular XOR difference in the checksum. For example, in the first two samples only bit 0 changes and the XOR difference is 39 ^ 37 = 0e.

We can quickly spot a pattern if we express in binary as well as hex:

    04 92 39 00 00
    04 92 37 01 00 ; bit 1 = 0e 00001110
    04 92 25 02 00 ; bit 2 = 1c 00011100
    04 92 01 04 00 ; bit 4 = 38 00111000
    04 92 49 08 00 ; bit 8 = 70 01110000

So as the bit position moves one to the left, so does the difference. Nice and easy. So with this information we can calculate all the other values, so that 0A should be 39 ^ 1c ^ 70 = 55 and indeed, that's what your data shows:

04 92 55 0A 00
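As a minimal sketch of the observation so far, the snippet below predicts the data set 1 checksum from the counter alone. The names `kBase` and `predict_set1` are made up for illustration, and the function assumes only what the four single-bit samples show: counter bit i toggles the mask 0e shifted left by i.

```cpp
#include <cstdint>

// Checksum of the all-zero data set 1 message (04 92 39 00 00).
constexpr std::uint8_t kBase = 0x39;

// Predict the checksum of a data set 1 message from its counter (00..0F),
// assuming counter bit i toggles the mask 0x0e << i.
constexpr std::uint8_t predict_set1(std::uint8_t counter) {
    std::uint8_t checksum = kBase;
    for (unsigned bit = 0; bit < 4; ++bit) {
        if (counter & (1u << bit)) {
            checksum ^= static_cast<std::uint8_t>(0x0e << bit);
        }
    }
    return checksum;
}

static_assert(predict_set1(0x0A) == 0x55, "matches 04 92 55 0A 00");
```

This only covers the low four counter bits of this one message type; nothing here says yet what happens when a mask overflows a byte.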

We can try the same thing with data set 2:

    03 C0 66 00 00 00
    03 C0 D3 01 00 00 ; bit 1 = b5 10110101
    03 C0 23 02 00 00 ; bit 2 = 45 01000101
    03 C0 EC 04 00 00 ; bit 4 = 8a 10001010
    03 C0 5D 08 00 00 ; bit 8 = 3b 00111011

We can try the same thing with data set 3:

    03 C0 9B 00 23 00
    03 C0 2E 01 23 00 ; bit 1 = b5 10110101
    03 C0 DE 02 23 00 ; bit 2 = 45 01000101
    03 C0 11 04 23 00 ; bit 4 = 8a 10001010
    03 C0 A0 08 23 00 ; bit 8 = 3b 00111011

So they both use the same pattern, but it's no longer just a left shift. If we rotate b5 left by one bit we get 6b. If we then XOR that with 2e we get 45. Put another way, shifting b5 left gives 16a, and 16a ^ 12f = 45. So the pattern seems to be this:

    m <<= 1
    if (m & 100) m ^= 12f

This looks very much like an 8-bit CRC with polynomial 0x2f.
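The step above can be written out as a small function and checked against the observed differences. This is a sketch; `next_mask` is a name chosen here, not something from the original data.

```cpp
#include <cstdint>

// One step of the observed pattern: shift left by one bit and, if a bit
// falls off the top of the byte, XOR with 0x12f to fold it back in.
constexpr std::uint8_t next_mask(std::uint8_t m) {
    const unsigned shifted = static_cast<unsigned>(m) << 1;
    return static_cast<std::uint8_t>(
        (shifted & 0x100) ? (shifted ^ 0x12f) : shifted);
}

// Differences seen in data sets 2 and 3.
static_assert(next_mask(0xb5) == 0x45, "bit 1 -> bit 2");
static_assert(next_mask(0x45) == 0x8a, "bit 2 -> bit 4");
static_assert(next_mask(0x8a) == 0x3b, "bit 4 -> bit 8");
// Data set 1 never overflows, so it looks like a plain shift.
static_assert(next_mask(0x0e) == 0x1c, "plain shift case");
```

The same function reproduces both the plain shifts of data set 1 and the wrapped values of data sets 2 and 3, which is what makes the CRC interpretation plausible.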

Since we only have mask values for the low four bits, we can only guess that the next four bits also have this same pattern, so the extended tables for each would look like this:

    mask  set 1  sets 2/3
    01    0e     b5
    02    1c     45
    04    38     8a
    08    70     3b
    10    e0     76
    20    ef     ec
    40    f1     f7
    80    cd     c1

We can keep going:

    mask  set 1  sets 2/3
    0100  b5     ad
    0200  45     75
    0400  8a     ea
    0800  3b     fb
    1000  76     d9
    2000  ec     9d
    4000  f7     15
    8000  c1     2a
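Both columns of these tables come from repeating the same step, so they can be regenerated rather than typed in. The sketch below uses the illustrative helpers `next_mask` and `nth_mask` and assumes the pattern continues unchanged beyond the four bits that were actually observed.

```cpp
#include <cstdint>

// Shift left by one bit; fold an overflow back in with 0x12f.
constexpr std::uint8_t next_mask(std::uint8_t m) {
    const unsigned shifted = static_cast<unsigned>(m) << 1;
    return static_cast<std::uint8_t>(
        (shifted & 0x100) ? (shifted ^ 0x12f) : shifted);
}

// Mask reached after applying the step `steps` times to `start`.
constexpr std::uint8_t nth_mask(std::uint8_t start, unsigned steps) {
    std::uint8_t m = start;
    for (unsigned i = 0; i < steps; ++i) {
        m = next_mask(m);
    }
    return m;
}

// Eight steps past the first data set 1 mask lands on the first mask of
// data sets 2 and 3, and eight more lands on ad.
static_assert(nth_mask(0x0e, 8) == 0xb5, "set 1 column runs into sets 2/3");
static_assert(nth_mask(0x0e, 16) == 0xad, "start of the next byte");
```

Seen this way there is really one chain of masks, and each column of the tables is an eight-entry window into it.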

It's interesting that the mask values for data set 1, if extended beyond one byte, exactly match the values for data sets 2 and 3. This suggests that the mask values may be positional. If we start with the all zeroes message in data set 2:

03 C0 66 00 00 00

And then try this theory with an arbitrary message from data set 3, starting with the initial value of 66:

03 C0 15 09 23 00

Starting from the back we have 00, which contributes nothing so we still have 66. For 23 we use the first set of mask values:

    23 => ef ^ 1c ^ 0e = fd
    66 ^ fd = 9b

For the 09 we use the next set of mask values:

    09 => 3b ^ b5 = 8e
    9b ^ 8e = 15

This works and gives us the correct value. So now we're only left with the question of the initial value 66 or in the case of the shorter messages, 39.
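The positional idea can be sketched for the four-byte 3C0 messages as follows. The names `mask_for` and `predict_3c0` are hypothetical, the starting value 66 is simply taken from the all-zero sample, and the last byte is ignored because it is 00 in every sample and its masks are not known at this point.

```cpp
#include <cstdint>

// Shift left by one bit; fold an overflow back in with 0x12f.
constexpr std::uint8_t next_mask(std::uint8_t m) {
    const unsigned shifted = static_cast<unsigned>(m) << 1;
    return static_cast<std::uint8_t>(
        (shifted & 0x100) ? (shifted ^ 0x12f) : shifted);
}

// Mask toggled by bit `bit` of a byte with `following` bytes after it
// (following >= 1). following == 1 starts at 0e, following == 2 at b5.
constexpr std::uint8_t mask_for(unsigned following, unsigned bit) {
    std::uint8_t m = 0x0e;
    for (unsigned i = 0; i < 8 * (following - 1) + bit; ++i) {
        m = next_mask(m);
    }
    return m;
}

// Predict the checksum of "03 C0 ?? counter data 00" from the all-zero
// message 03 C0 66 00 00 00.
constexpr std::uint8_t predict_3c0(std::uint8_t counter, std::uint8_t data) {
    std::uint8_t checksum = 0x66;
    for (unsigned bit = 0; bit < 8; ++bit) {
        if (data & (1u << bit)) checksum ^= mask_for(1, bit);
        if (counter & (1u << bit)) checksum ^= mask_for(2, bit);
    }
    return checksum;
}

static_assert(predict_3c0(0x09, 0x23) == 0x15, "matches 03 C0 15 09 23 00");
```

Only the counter values 00 to 0F and the data values 00 and 23 are backed by captured messages; anything else is an extrapolation of the pattern.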

What if we work backwards from the initial mask value of 0e? Assuming that the other data byte and then the address bytes are processed, we need to calculate three bytes' worth of mask values. To go backwards we use this:

    if (m & 1) m ^= 12f
    m >>= 1

    800000 07
    400000 94
    200000 4a
    100000 25
    080000 85
    040000 d5
    020000 fd
    010000 e9
    008000 e3
    004000 e6
    002000 73
    001000 ae
    000800 57
    000400 bc
    000200 5e
    000100 2f
    000080 80
    000040 40
    000020 20
    000010 10
    000008 08
    000004 04
    000002 02
    000001 01

That's very interesting, because the first byte's value is simply itself, which would make programming this very simple.
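The backward step can be checked the same way as the forward one. In this sketch `prev_mask` and `rewind` are illustrative names; the only inputs are the starting mask 0e and the constant 12f from the text.

```cpp
#include <cstdint>

// Undo one forward step: if the low bit is set, the forward step must
// have XORed in 0x12f, so XOR it back before shifting right.
constexpr std::uint8_t prev_mask(std::uint8_t m) {
    const unsigned v = (m & 1) ? (m ^ 0x12fu) : m;
    return static_cast<std::uint8_t>(v >> 1);
}

// Apply the backward step `steps` times.
constexpr std::uint8_t rewind(std::uint8_t m, unsigned steps) {
    for (unsigned i = 0; i < steps; ++i) {
        m = prev_mask(m);
    }
    return m;
}

static_assert(prev_mask(0x0e) == 0x07, "row 800000");
static_assert(rewind(0x0e, 8) == 0xe9, "row 010000");
static_assert(rewind(0x0e, 16) == 0x2f, "row 000100");
static_assert(rewind(0x0e, 24) == 0x01, "row 000001");
```

Landing exactly on 01 after 24 steps is what shows that the lowest byte's masks are just the bit values themselves.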

The question

Now that we have that worked out, how do the initial values get calculated? We have three 00 values from the three data sets. Perhaps the table above that we just calculated could be used to derive the initial value based on the address and an additional zero byte?

    04 92 39 00 00
    03 C0 66 00 00 00

We don't know the order of the address bytes, so we calculate both ways:

    0492 => 2e nope
    9204 => 17 nope
    03c0 => b1 nope
    c003 => 06 nope

Apparently it's not that simple. Perhaps the address is preprocessed by XORing it with some fixed value? If that's the case, then the XOR of the two addresses should produce a value that can be used to obtain the initial value.

0492 ^ 03c0 = 752

However, nothing seems to work with that approach either.

One possibility I haven't tried yet is that the initial value is a CRC8 of the address, but that's just a guess.

Update: I tried it and that's exactly what it was. The CRC is computed over the bytes from just after the checksum to the end of the message, followed by one extra byte: the XOR of the two ID bytes. In the code below, the CRC register starts at ff and the final value is XORed with ff.

The code

    #include <cstdint>
    #include <initializer_list>
    #include <iostream>
    #include <numeric>
    #include <vector>

    // Each message is stored as: ID high byte, ID low byte, checksum, then
    // the remaining payload bytes.
    class PorscheCanMessage : public std::vector<std::uint8_t> {
    public:
        PorscheCanMessage(std::initializer_list<std::uint8_t> l);
        std::uint8_t checksum() const;
    };

    static std::vector<PorscheCanMessage> samples{
        {0x04,0x92,0x39,0x00,0x00},
        {0x04,0x92,0x37,0x01,0x00},
        {0x04,0x92,0x25,0x02,0x00},
        {0x04,0x92,0x2B,0x03,0x00},
        {0x04,0x92,0x01,0x04,0x00},
        {0x04,0x92,0x0F,0x05,0x00},
        {0x04,0x92,0x1D,0x06,0x00},
        {0x04,0x92,0x13,0x07,0x00},
        {0x04,0x92,0x49,0x08,0x00},
        {0x04,0x92,0x47,0x09,0x00},
        {0x04,0x92,0x55,0x0A,0x00},
        {0x04,0x92,0x5B,0x0B,0x00},
        {0x04,0x92,0x71,0x0C,0x00},
        {0x04,0x92,0x7F,0x0D,0x00},
        {0x04,0x92,0x6D,0x0E,0x00},
        {0x04,0x92,0x63,0x0F,0x00},
        {0x03,0xC0,0x66,0x00,0x00,0x00},
        {0x03,0xC0,0xD3,0x01,0x00,0x00},
        {0x03,0xC0,0x23,0x02,0x00,0x00},
        {0x03,0xC0,0x96,0x03,0x00,0x00},
        {0x03,0xC0,0xEC,0x04,0x00,0x00},
        {0x03,0xC0,0x59,0x05,0x00,0x00},
        {0x03,0xC0,0xA9,0x06,0x00,0x00},
        {0x03,0xC0,0x1C,0x07,0x00,0x00},
        {0x03,0xC0,0x5D,0x08,0x00,0x00},
        {0x03,0xC0,0xE8,0x09,0x00,0x00},
        {0x03,0xC0,0x18,0x0A,0x00,0x00},
        {0x03,0xC0,0xAD,0x0B,0x00,0x00},
        {0x03,0xC0,0xD7,0x0C,0x00,0x00},
        {0x03,0xC0,0x62,0x0D,0x00,0x00},
        {0x03,0xC0,0x92,0x0E,0x00,0x00},
        {0x03,0xC0,0x27,0x0F,0x00,0x00},
        {0x03,0xC0,0x9B,0x00,0x23,0x00},
        {0x03,0xC0,0x2E,0x01,0x23,0x00},
        {0x03,0xC0,0xDE,0x02,0x23,0x00},
        {0x03,0xC0,0x6B,0x03,0x23,0x00},
        {0x03,0xC0,0x11,0x04,0x23,0x00},
        {0x03,0xC0,0xA4,0x05,0x23,0x00},
        {0x03,0xC0,0x54,0x06,0x23,0x00},
        {0x03,0xC0,0xE1,0x07,0x23,0x00},
        {0x03,0xC0,0xA0,0x08,0x23,0x00},
        {0x03,0xC0,0x15,0x09,0x23,0x00},
        {0x03,0xC0,0xE5,0x0A,0x23,0x00},
        {0x03,0xC0,0x50,0x0B,0x23,0x00},
        {0x03,0xC0,0x2A,0x0C,0x23,0x00},
        {0x03,0xC0,0x9F,0x0D,0x23,0x00},
        {0x03,0xC0,0x6F,0x0E,0x23,0x00},
        {0x03,0xC0,0xDA,0x0F,0x23,0x00},
    };

    PorscheCanMessage::PorscheCanMessage(std::initializer_list<std::uint8_t> l)
    {
        reserve(l.size());
        insert(end(), l.begin(), l.end());
    }

    // Feed one byte into an 8-bit CRC with polynomial 0x2f, MSB first.
    std::uint8_t crc(std::uint8_t crc, std::uint8_t data)
    {
        constexpr std::uint8_t poly{0x2f};
        crc ^= data;
        for (unsigned bits{8}; bits; --bits) {
            if (crc & 0x80) {
                crc = (crc << 1) ^ poly;
            } else {
                crc <<= 1;
            }
        }
        return crc;
    }

    std::uint8_t PorscheCanMessage::checksum() const
    {
        // Start at 0xff and run over everything after the checksum byte.
        std::uint8_t crcval = std::accumulate(cbegin() + 3, cend(),
                                              std::uint8_t{0xff}, crc);
        // Then feed in the XOR of the two ID bytes and invert the result.
        std::uint8_t combined_id{static_cast<std::uint8_t>(at(0) ^ at(1))};
        crcval = crc(crcval, combined_id);
        return crcval ^ 0xff;
    }

    int main()
    {
        // Print the captured checksum next to the calculated one.
        for (const auto& m : samples) {
            std::cout << std::hex
                << "message: " << static_cast<unsigned>(m.at(2))
                << ", calculated: " << static_cast<unsigned>(m.checksum())
                << '\n';
        }
    }
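Since the original goal was to recompute the first byte after changing the payload, here is a minimal sketch of how the same calculation might be wrapped for that purpose. The function names `crc8_update`, `frame_checksum` and `refresh_checksum` are made up for this example. It assumes an 11-bit ID split into a high and a low byte exactly as in the samples, and it has only been checked against the IDs 492 and 3C0 shown above, so other messages may need a different final byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Feed one byte into an 8-bit CRC with polynomial 0x2f, MSB first.
inline std::uint8_t crc8_update(std::uint8_t crc, std::uint8_t data) {
    crc ^= data;
    for (int bit = 0; bit < 8; ++bit) {
        crc = static_cast<std::uint8_t>(
            (crc & 0x80) ? ((crc << 1) ^ 0x2f) : (crc << 1));
    }
    return crc;
}

// Checksum for a frame. payload[0] is the checksum slot and is skipped;
// the XOR of the two ID bytes is fed in last (assumption from the samples).
inline std::uint8_t frame_checksum(std::uint16_t can_id,
                                   const std::vector<std::uint8_t>& payload) {
    std::uint8_t crc = 0xff;
    for (std::size_t i = 1; i < payload.size(); ++i) {
        crc = crc8_update(crc, payload[i]);
    }
    const auto id_byte =
        static_cast<std::uint8_t>((can_id >> 8) ^ (can_id & 0xff));
    crc = crc8_update(crc, id_byte);
    return static_cast<std::uint8_t>(crc ^ 0xff);
}

// Recompute and store the checksum after the payload has been edited.
inline void refresh_checksum(std::uint16_t can_id,
                             std::vector<std::uint8_t>& payload) {
    if (!payload.empty()) {
        payload[0] = frame_checksum(can_id, payload);
    }
}
```

To use it, fill in the counter and data bytes of the frame, then call `refresh_checksum(0x3C0, payload)` before sending; whatever was in `payload[0]` beforehand is ignored and overwritten.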

Dude!! Thank you so much for figuring this out and the detailed steps too! I am still reading and trying to understand it all, so that I can in the future figure similar things out myself... I do have a few questions and would love to chat!! How can I get in touch with you? — Stefan
– Stefan, Commented Dec 5, 2022 at 19:14
I finished my project!!! Thanks to this help here and some more research I finally got to the bottom of it all. JUST IN CASE ANYONE FINDS THIS PAGE on the same mission, Porsche (Volkswagen) definitely came up with the funkiest way to calculate a checksum. There is a lot of information on this page here: github.com/commaai/opendbc/blob/master/can/common.cc — Stefan
– Stefan, Commented Apr 13, 2023 at 21:54
