This is a FAQ in the C# tag, but the duplicate is hard to find. Why the cast is needed is the first thing to explain. The underlying reason is that the CLI only specifies a limited number of valid operand types for the OpCodes.Add IL instruction. Only operands of type Int32, Int64, Single, Double and IntPtr are supported. IntPtr is special as well, the C# language forbids using that one.
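You can see those restrictions in action with a minimal Reflection.Emit sketch (a demo of my own, the method name `AddBytes` is arbitrary) that builds the addition by hand. The byte arguments land on the evaluation stack already widened to Int32 before OpCodes.Add executes:

```csharp
using System;
using System.Reflection.Emit;

class Demo
{
    static void Main()
    {
        var method = new DynamicMethod("AddBytes", typeof(int),
                                       new[] { typeof(byte), typeof(byte) });
        var il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);  // byte argument is widened to Int32 as it's pushed
        il.Emit(OpCodes.Ldarg_1);  // same for the second operand
        il.Emit(OpCodes.Add);      // Add only ever sees Int32 operands
        il.Emit(OpCodes.Ret);

        var add = (Func<byte, byte, int>)method.CreateDelegate(typeof(Func<byte, byte, int>));
        Console.WriteLine(add(200, 100));  // 300, an Int32 result
    }
}
```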
So the C# compiler has to use an implicit conversion to uplift the byte to a type that the operator supports. It picks Int32 as the closest compatible type. The result of the addition is Int32, which does not fit back into a byte without truncating the result, throwing away the extra bits. An obvious example is 255 + 1: the result is 256 in Int32, doesn't fit a Byte and yields 0 when stored.
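In code, that looks like this; the commented-out line is the one the compiler rejects:

```csharp
byte a = 255;
byte b = 1;
// byte sum = a + b;             // CS0266: cannot implicitly convert 'int' to 'byte'
int wide = a + b;                // 256, the addition happens in Int32
byte truncated = (byte)(a + b);  // 0, the extra bit is thrown away
Console.WriteLine($"{wide} {truncated}");   // prints "256 0"
```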
That's a problem; the language designers didn't want that truncation to happen without you explicitly acknowledging that you are aware of the consequences. A cast is required to convince the compiler that you're aware. Bit of a cop-out of course, you tend to produce the cast mechanically without thinking much about the consequences. But that makes it your problem, not Microsoft's :)
The rub was the += operator, a very nice operator to write condensed code. Resembles the brevity of the var keyword. Rock and a hard place however: where do you put the cast? It just doesn't work, so they punted the problem and allowed truncation without a cast, arbitrarily declaring that using += counts as "knowing what you're doing".
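Here's what that compromise looks like in practice:

```csharp
byte counter = 255;
counter += 1;                    // compiles without a cast...
Console.WriteLine(counter);      // ...but still wraps around to 0
// The long-hand form needs the cast that the operator lets you skip:
// counter = counter + 1;        // CS0266
counter = (byte)(counter + 1);   // 1
```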
Notable is the way VB.NET works: it doesn't require a cast. But it gives a guarantee that C# doesn't provide by default, it will generate an OverflowException when the result doesn't fit. Pretty nice, but that check doesn't come for free.
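You don't have to switch languages to get that guarantee; C#'s standard `checked` keyword opts into the same range checking, as in this sketch:

```csharp
byte a = 255;
try
{
    checked
    {
        byte sum = (byte)(a + 1);  // the conversion is range-checked here
        Console.WriteLine(sum);
    }
}
catch (OverflowException)
{
    Console.WriteLine("result doesn't fit a Byte");  // this branch runs
}
```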
Designing clean languages is a very hard problem. The C# team did an excellent job, warts notwithstanding. These warts were brought on by processor design. IL has these type restrictions because that's what real 32-bit processors have too, particularly the RISC designs that were popular in the 90s. Their internal registers can only handle 32-bit integers and IEEE-754 floating point, and only permit smaller types in loads and stores. The Intel x86 core is very popular and actually permits basic operations on smaller types. But that's mostly a historical accident, due to Intel keeping the design compatible through the 8-bit 8080 and 16-bit 8086 generations. It doesn't come for free: 16-bit operations cost an extra cpu cycle. To be avoided.