Is the load part of a read-modify-write operation of atomic object guaranteed to read the last value in modification order compared to load operation?

Question

Consider this example:

std::atomic<int> v = {0}; // thread 1: for(int i = 0; i<999999;i++) v.load(memory_order::seq_cst); // #1 v.exchange(2,memory_order::seq_cst); // #2 //thread 2: v.store(1,memory_order::seq_cst); // #3

Assuming the modification order of v is {0, 1} where 0 is written by the initialization of v and 1 is stored by #3 in thread 2 and both #1 and #2 does not happen before #3. According to [intro.races] p14 ~ p18:

The value of an atomic object M, as determined by evaluation B, is the value stored by some unspecified side effect A that modifies M, where B does not happen before A.

If an operation A that modifies an atomic object M happens before an operation B that modifies M, then A is earlier than B in the modification order of M.

If a value computation A of an atomic object M happens before a value computation B of M, and A takes its value from a side effect X on M, then the value computed by B is either the value stored by X or the value stored by a side effect Y on M, where Y follows X in the modification order of M.

If a value computation A of an atomic object M happens before an operation B that modifies M, then A takes its value from a side effect X on M, where X precedes B in the modification order of M.

If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B takes its value from X or from a side effect Y that follows X in the modification order of M.

the load operation at #1 is permitted to always read 0 until the end of the loop, it also can be 1. However, the load part of #2 must read 1 that is written by #3 as per [atomics.order] p10

Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.

In this case, if #1 always reads 0 in its loop, it also does not violate the single total: #1 is sequenced before #2, which also means #1 strongly happens before #2, so, in the single total order, #1 precedes #2, since #1 reads 0 and 0 precedes 1(produced by #3) in the modification order of v, according to [atomics.order] p3

An atomic operation A on some atomic object M is coherence-ordered before another atomic operation B on M if

A is a modification, and B reads the value stored by A, or

[...]

A and B are not the same atomic read-modify-write operation, and there exists an atomic modification X of M such that A reads the value stored by X and X precedes B in the modification order of M, or

#1 is coherence-ordered before #3, hence, #1 precedes #3 in the single total order. Finally, since #2 reads the value written by #3, #3 is coherence-ordered before #2, hence #3 precedes #2 in the single total order. which all three requirements

#1 precedes #2

#1 precedes #3

#3 precedes #2

can form a valid single total {#1, #3, #2}

Hence, with the pre-conditions, #1 can always load 0, and #2 must read the value 1. Is this a possible output?

As far as I can see, any of three things can happen: (A) all #1 loads read 0 and #2 also reads 0; (B) all #1 loads read 0 and #2 reads 1; (C) at least one of the #1 loads read 1, and so do all subsequent #1 as well as #2. Each case puts #3 somewhere different in the total order, but all are consistent with #1 and #2 not happening before #3. (Your assumption that the modification order of v is {0,1} is not quite possible because the value 2 must also be included; it can either be {0,1,2} or {0,2,1}.) — Nate Eldredge
– Nate Eldredge, Commented Sep 6, 2024 at 2:49
Indeed, regardless of the program's behavior, there is no case in which we could possibly conclude that either #1 or #2 happens before #3, as thread 2 doesn't contain any operation that could provide synchronization in that direction. So your assumption (#1 and #2 don't happen before #3) doesn't restrict the program's behavior in any way. — Nate Eldredge
– Nate Eldredge, Commented Sep 6, 2024 at 2:52
@NateEldredge Given the assumption that the modification order is {0,1}, <<all #1 loads read 0 and #2 also reads 0>> why can#2 load 0? — xmh0511
– xmh0511, Commented Sep 6, 2024 at 3:52
@NateEldredge Please look at this question stackoverflow.com/questions/79118090/…, I think it expresses clearer than this question. — xmh0511
– xmh0511, Commented Oct 23, 2024 at 13:26

HolyBlackCat · Accepted Answer · 2024-09-06 07:10:22Z

I think you're overthinking it.

Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.

This just says that RMWs are invidivisable, i.e. no writes can be inserted between the read part and the write part. (In other words, they say that the "write" part of RMW appears in the modification order right after the value that was read by the "read" part, that's all).

This doesn't say that RMWs are any less prone to reading stale values than non-RMW reads. (Maybe that's what happens in practice on some platforms, but that's above my paygrade and isn't the point.)

This presumed difference doesn't affect how you analyze the programs anyway. If all #1s returned 0, and #2 suddenly returned 1, is it because, #1 was reading a "stale" value, or because #3 happened to happen exactly between the last #1 and #2? You can't tell, and it doesn't matter.

Also operations on v have to be consistent with the global seq-cst order, but that doesn't tell you much. (If all #1s returned 0 and #2 returned 1, then #3 lands between the latest #1 and #2. If some #1 returns 1, then #3 is right before that, etc.)

[assume that] both #1 and #2 does not happen before #3

They can't happen-before #3 in the first place, I believe (for A to happen-before B in a different thread, there has to be a synchronizes-with relationship, so either B must be a read, or must have a preceding read in the same thread). At best you can say they're coherence-ordered-before, or that #2 precedes #3 in the modification order of v.

In general, I'd start my assumptions with what value each load returns, and deduce the various "...-before" orderings from there, not the other way around.

This just sounds like they want to guarantee that RMWs are invidivisable, i.e. no writes can be inserted between the read part and the write part. As said in the question, I gave the assumption that the modification is {0, 1}, when the RMW wrote it side effect to the modification order, its read value must be the latest one before its written value, that is 1, isn't it?
@xmh0511 Yeah, this is what you said in the question. I think I addressed this in the answer. Again: 1. I don't think "last" means what you think it does. They just mean that the write part of RMW appears in the modification order right after the value that was read by the read part, that's all. They aren't saying that RMWs are less prone to reading stale values. And 2. this assumption has no practical effect anyway, it doesn't help you reason about the program behavior. If all #1s return 0 and #2 returns 1, ...
Or, another way to think about it is that there's no such thing as the "current" modification order in the first place. There's only the (timeless) modification order as it will be at the end of execution, and you can use the most recent load to judge where you (a specific thread) are in it currently.
@xmh0511 Again, strictly speaking there's no such thing as the "current" order. You only know the value returned by the most recent order, and can use it to judge about where in the final (as it will be at the end of execution) modification order you currently are.

Quuxplusone · Accepted Answer · 2024-09-07 20:31:38Z

The text of your question doesn't match its title. The title mentions "read-modify-write operations," but in your question itself, you only use the following operations:

v.load() (read but not write)
v.store(1) (write but not read)
v.exchange(2) (read, then write)

A "read-modify-write operation" would be like compare_exchange_strong, or fetch_add (i.e. +=) — something that reads the atomic's old value, computes a new value based on ("modifying") that old value, and then writes the new value.

All atomic operations, including compare_exchange_strong and fetch_add, happen atomically, which means there's no way for any other thread to sneak a modification in between the read and the write. In other words, atomic operations are atomic. This is not the case for operations that are "non-atomic" — i.e., composed of smaller operations. For example:

int i = v.exchange(2); // A: Set v to 2 and load its // immediately previous value into i. // This is one atomic operation. int i = v.load(); v.store(2); // B: Load v's value into i, // *then* set v to 2. // This is two atomic operations.

Suppose v is initially zero, and some other thread is doing int j = v.exchange(3) at the same time that we're executing one of the snippets above. Then it's possible for the second snippet to observe i=0 and j=0; whereas the first snippet can only ever observe either i=0 and j=2, or i=3 and j=0.

Does that help?

eel.is/c++draft/atomics#types.operations-19 exchange is RMW operation
Sure, in the context of defining a release sequence ( eel.is/c++draft/intro.multithread#intro.races-5 ) — but you don't seem to be asking about that. Step 1 is just to understand what C++ means by "atomic"; and then, I claim, there is no Step 2 needed. :)

Collectives™ on Stack Overflow

Is the load part of a read-modify-write operation of atomic object guaranteed to read the last value in modification order compared to load operation?

2 Answers 2

26 Comments

2 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

26 Comments

2 Comments

Linked

Related