Multiprocessor systems have some kind of cache coherency protocol built into them, e.g. MSI, MESI, etc. Cache coherency only matters when instructions executing on two different processors try to read/write shared data. But for shared data to be usable in practice, the programmer has to introduce memory barriers anyway. Without memory barriers, the shared data is going to be "wrong" regardless of whether the underlying processor implements cache coherence or not. Why, then, do we need cache coherence mechanisms at the hardware level?
1 Answer
Without cache coherency, instead of mere barriers, you'd have to explicitly flush and invalidate caches when accessing shared data, which has a much higher overhead than cache coherency.
Historically, there have been a few non-coherent shared-memory multiprocessor architectures, but they have all died out in favor of cache coherency (CC) because they were very difficult to program correctly and efficiently.
Coherency is also what makes cheap atomic operations possible, e.g. std::memory_order_relaxed, i.e. just atomicity, no ordering wrt. other operations. Perhaps you're misunderstanding exactly what barriers do: see Does a memory barrier ensure that the cache coherence has been completed?. Also, When to use volatile with multi threading? discusses how coherence makes hand-rolled C atomics work.