Why doesn't Go specify padding content for struct comparison? [closed]

Question

From Dave Cheney's article about the struct comparison code generated by the Go compiler (https://dave.cheney.net/2020/05/09/ensmallening-go-binaries-by-prohibiting-comparisons):

Padding exists to ensure the correct field alignments, and while it does take up space in memory, the contents of those padding bytes are unknown. You might assume that, being Go, the padding bytes are always zero, but it turns out that’s not the case–the contents of padding bytes are simply not defined. Because they’re not defined to always be a certain value, doing a bitwise comparison may return false because the nine bytes of padding spread throughout the 24 bytes of S [a previously defined struct with padding] may not be the same.

The Go compiler solves this problem by generating what is known as an equality function. In this case S's equality function knows how to compare two values of type S by comparing only the fields in the function while skipping over the padding.

EDIT: the same source states that values of type struct {int64, int64} are compared using a memory compare, while struct {int64, int8} requires a custom equality function because of padding, which enlarges the resulting binary.
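
For reference, the layout difference between those two structs can be inspected with unsafe.Sizeof and unsafe.Offsetof. This is only a minimal sketch: the names Padded, Packed, paddedPadding and packedPadding are made up for illustration, and the sizes in the comments assume a 64-bit platform such as amd64.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Padded mirrors struct{int64, int8}: 9 bytes of field data,
// rounded up so that the int64 stays aligned in arrays.
type Padded struct {
	A int64
	B int8
}

// Packed mirrors struct{int64, int64}: every byte belongs to a field.
type Packed struct {
	A int64
	B int64
}

// paddedPadding returns the number of bytes in Padded not occupied by a field.
func paddedPadding() uintptr {
	var p Padded
	return unsafe.Sizeof(p) - unsafe.Sizeof(p.A) - unsafe.Sizeof(p.B)
}

// packedPadding does the same for Packed.
func packedPadding() uintptr {
	var q Packed
	return unsafe.Sizeof(q) - unsafe.Sizeof(q.A) - unsafe.Sizeof(q.B)
}

func main() {
	var p Padded
	fmt.Println("Padded size:", unsafe.Sizeof(p))      // 16 on amd64
	fmt.Println("offset of B:", unsafe.Offsetof(p.B))  // 8
	fmt.Println("Padded padding:", paddedPadding())    // 7 on amd64
	fmt.Println("Packed padding:", packedPadding())    // 0
}
```

The trailing bytes after B are the ones whose content is left undefined, which is why a raw memory compare is only safe for the Packed layout.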

Why doesn't the Go compiler solve this by defining the content of padding bytes, so that it can compare using something like memcmp instead?

EDIT: Is there any overhead in zeroing or comparing one word instead of one byte (e.g.: zeroing and comparing 16 bytes instead of 9 in the previous struct {int64, int8} example)?

Mainly because memcmp doesn't do what Go needs. memcmp on string fields would not implement what Go requires from a string comparison. It has to use the fields anyway.
– Volker, Commented May 11, 2020 at 14:57
memcmp could see a struct{ int16, int16 } as equal to a struct{ int32 }, even though they can never be equal in Go. Padding is irrelevant.
– Adrian, Commented May 11, 2020 at 15:36
I would add to the answer and the comments that your point of view might be a bit skewed by fixating on memcmp. A compiler may be able to generate highly effective type-specific comparison code. Even C compilers will try to inline calls to memcmp when they can be sure the semantics of this symbol were not messed up by a programmer (say, by redefining that symbol). Go compilers are free to generate effective comparison code on a type-by-type basis, right away (since Go is much stricter than C when it comes to typing).
– kostix, Commented May 11, 2020 at 16:57
Worth noting: even (or especially?) in C or C++, the padding areas can cause problems: hashing the raw bytes of a struct, for fast lookup, fails to find matching structs when it's a field-by-field match that we care about and use later. I fixed a few bugs like this in the last few years...
– torek, Commented May 11, 2020 at 19:58
According to the source, a struct{int64,int64} is compared using "memcmp". A struct{int8,int64} cannot be, because its padding is not zeroed out; at least, the spec does not require it. Of course memcmp shouldn't be used on structs that are not comparable with each other, such as struct{int64} and struct{int32,int32}; that is a compilation error. memcmp would fail on strings too. The article is about the size of Go binaries growing when there are many comparable structs. A one-size-fits-all memcmp-like routine for padded structures could help with that. Would zeroing or comparing padding make a real impact on performance?
– neclepsio, Commented May 11, 2020 at 20:33

chash · Accepted Answer · 2020-05-11 23:10:20Z

From the spec:

Struct values are comparable if all their fields are comparable. Two struct values are equal if their corresponding non-blank fields are equal.

In other words, struct equality is not a simple byte-by-byte comparison. It is a field-by-field comparison using the rules for comparability/equality of each field.
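
One illustration of why the per-field rules matter, beyond what the spec quote itself says, is a float field: NaN never equals itself even when the bytes are identical, and +0 equals -0 even though the bytes differ. The sketch below is hypothetical (Reading, equalFields and samples are names invented here); equalFields is a hand-written stand-in for field-by-field equality, not the code the compiler actually generates.

```go
package main

import (
	"fmt"
	"math"
)

// Reading has padding after Sensor on common platforms and a float field
// whose equality rules differ from a bitwise comparison.
type Reading struct {
	Sensor int8
	Value  float64
}

// equalFields compares two values field by field, ignoring any padding.
func equalFields(a, b Reading) bool {
	return a.Sensor == b.Sensor && a.Value == b.Value
}

// samples returns a NaN reading, a +0 reading and a -0 reading.
func samples() (nan, posZero, negZero Reading) {
	nan = Reading{Sensor: 1, Value: math.NaN()}
	posZero = Reading{Sensor: 1, Value: 0}
	negZero = Reading{Sensor: 1, Value: math.Copysign(0, -1)}
	return
}

func main() {
	nan, posZero, negZero := samples()
	sameNaN := nan // a bit-for-bit copy

	fmt.Println(nan == sameNaN, equalFields(nan, sameNaN))         // false false
	fmt.Println(posZero == negZero, equalFields(posZero, negZero)) // true true

	// The two zeros are equal under ==, yet their bit patterns differ.
	fmt.Println(math.Float64bits(posZero.Value) == math.Float64bits(negZero.Value)) // false
}
```

In both cases == agrees with the field-by-field function and disagrees with what a raw byte comparison would report.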

Edit:

Even if padding were zeroed, many structs could still not be compared directly using something like memcmp. Types like strings and interfaces are not memory comparable due to their underlying type representations, even though they are logically comparable and can be equal according to the spec. To see this in action, look at https://play.golang.org/p/lmu-THnWY3W.
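
A small local sketch of the string case (separate from the playground link, whose contents are not reproduced here): two struct values whose string fields hold the same text are equal under ==, even when the strings point at different backing memory. The names Named, build and sameBacking are hypothetical, and unsafe.StringData assumes Go 1.20 or later.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Named contains a string, whose in-memory form is a pointer plus a length.
type Named struct {
	ID   int64
	Name string
}

// build creates a Named whose Name is copied from a byte slice at run time.
func build(id int64, name []byte) Named {
	return Named{ID: id, Name: string(name)}
}

// sameBacking reports whether two strings share the same data pointer.
func sameBacking(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

func main() {
	x := build(1, []byte("gopher"))
	y := build(1, []byte("gopher"))

	// Equal according to the spec: same ID, same string contents.
	fmt.Println(x == y) // true

	// Typically false here, since each Name is a separate copy; a raw
	// comparison of the struct bytes would then see different pointers.
	fmt.Println(sameBacking(x.Name, y.Name))
}
```

So even with zeroed padding, comparing the struct bytes would compare the string headers rather than the string contents.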

If you want to see how struct equality is implemented, check out the source code.

OK, let me restate the question. Some structs can be compared with a memory compare, and the compiler does that. Padding content is not defined, so structs with padding cannot. So why not specify that padding must be zeroed?
@neclepsio I can't speak for the Go authors, but maybe because it still wouldn't allow memory-based comparison of all structs, like those containing strings, interfaces, other structs containing those things, etc. It seems like an edge case, and zeroing the padding likely has some measurable overhead cost.
Thank you, that's what I think too, but I don't understand how, for example, on a 64-bit architecture it could take a different amount of time to zero or compare int64s versus int8s.
