Visibility of atomic operations with seq-cst fences in C++20

Question

Until C++17 the standard contained the following paragraph (C++17 Section 32.4 [atomics.order] paragraph 6):

For atomic operations A and B on an atomic object M, where A modifies M and B takes its value, if there are memory_order::seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B observes either the effects of A or a later modification of M in its modification order.

Suppose we have a StoreLoad reordering litmus test:

// Thread 1
a.store(1, std::memory_order_relaxed);
std::atomic_thread_fence(std::memory_order_seq_cst); // F1
b.load(std::memory_order_relaxed);

// Thread 2
b.store(1, std::memory_order_relaxed);
std::atomic_thread_fence(std::memory_order_seq_cst); // F2
a.load(std::memory_order_relaxed);

We have two seq-cst fences F1 and F2. My understanding of the paragraph above is that if F1 is ordered before F2 in S, the a.load() by Thread 2 is guaranteed to observe the value from the a.store() by Thread 1 (or some later change). If however F2 is ordered before F1, then the b.load() by Thread 1 is guaranteed to observe the value written by the b.store() in Thread 2.

However, in an attempt to "Strengthen SC fences" for C++20, P0668R5 proposed to replace paragraphs 3-7 with:

An atomic operation A on some atomic object M is coherence-ordered before another atomic operation B on M if

A is a modification, and B reads the value stored by A, or

A precedes B in the modification order of M, or

A and B are not the same atomic read-modify-write operation, and there exists an atomic modification X of M such that A reads the value stored by X and X precedes B in the modification order of M, or

there exists an atomic modification X of M such that A is coherence-ordered before X and X is coherence-ordered before B.

There is a single total order S on all memory_order::seq_cst operations, including fences, that satisfies the following constraints. First, if A and B are memory_order::seq_cst operations and A strongly happens before B, then A precedes B in S. Second, for every pair of atomic operations A and B on an object M, where A is coherence-ordered before B, the following four conditions are required to be satisfied by S:

if A and B are both memory_order::seq_cst operations, then A precedes B in S; and

if A is a memory_order::seq_cst operation and B happens before a memory_order::seq_cst fence Y, then A precedes Y in S; and

if a memory_order::seq_cst fence X happens before A and B is a memory_order::seq_cst operation, then X precedes B in S; and

if a memory_order::seq_cst fence X happens before A and B happens before a memory_order::seq_cst fence Y, then X precedes Y in S.

[ Note: This definition ensures that S is consistent with the modification order of any atomic object M. It also ensures that a memory_order::seq_cst load A of M gets its value either from the last modification of M that precedes A in S or from some non-memory_order::seq_cst modification of M that does not happen before any modification of M that precedes A in S. -- end note ]

[ Note: We do not require that S be consistent with "happens before" (6.8.2.1 [intro.races]). This allows more efficient implementation of memory_order::acquire and memory_order::release on some machine architectures. It can produce surprising results when these are mixed with memory_order::seq_cst accesses. -- end note ]

But I fail to see how this definition provides the same guarantee as the C++17 paragraph above. Said paragraph essentially defines how seq-cst fences can ensure visibility of atomic operations. The C++20 definition states "(..) for every pair of atomic operations A and B on an object M, where A is coherence-ordered before B (..)", so it requires A and B to be coherence-ordered. But according to the definition of coherence-ordered that requires B to observe the value written by A, i.e., instead of defining how changes can be made visible we are making visibility a precondition. Am I missing something?

Nate Eldredge · Accepted Answer · 2025-03-03 15:20:27Z

3

Under the C++17 rules, you proved that if F1 precedes F2 in S, then a.load() in thread 2 observes the value 1 from a.store() in thread 1, or a value from a modification later in the modification order.

Under the C++20 rules, we can prove the same statement by contrapositive.

Suppose a.load() in thread 2 (call it A) does not read the value 1 from a.store() in thread 1 (call it B), nor any later modification. Since A must observe the value of some modification, and the modification order is a total order, it must read the value from some modification X which is earlier than B. Thus by the third bullet in the definition of coherence-order, we have that A is coherence-ordered before B.

Now we can apply the fourth bullet of the requirements on S, taking X = F2 and Y = F1. We have that F2 happens before A and B happens before F1, so we conclude that F2 precedes F1 in S. Since S is a total order, this means that F1 does not precede F2.

A similar argument could prove (again by contrapositive) that if F2 precedes F1 then b.load() in thread 1 observes the value stored by b.store() in thread 2.

answered Mar 3 at 15:20

Nate Eldredge

62.4k7 gold badges78 silver badges121 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

mpoeter Mar 7 at 9:06

Thanks! I get your point, but for a complete argument we probably have to go a bit further. If the load does not observe the store, we can conclude that the load is coherence-ordered before the store, and in combination with the fences we can thus conclude a specific order of the fences. With respect to the original question, if neither load observes the store, this would mean that F1 is ordered before F2 as well as F2 is ordered before F1 (both in S), which is obviously a contradiction since S is a total order. So at least one of the loads must observe a store. Does that sound right?

mpoeter Mar 7 at 9:12

I probably would not have reached that conclusion by myself, partly because I did not never even considered the third bullet of the coherence-order definition because I stopped reading after "read-modify-write operation", disregarding it as not relevant for my case. IMHO the old paragraph was much easier to grasp, so a clarifying note in the standard would have been nice (there are already a few, why not add another one).

Nate Eldredge Mar 14 at 15:15

@mpoeter: If your goal is to prove "at least one load observes a store", your proof by contradiction works, but I think it's also implicit in what I wrote. I proved that if F1 precedes F2 in S, then the a store is observed; and if F2 precedes F1, then the b store is observed. Since S is a total order, exactly one of those two cases holds.

Collectives™ on Stack Overflow

Visibility of atomic operations with seq-cst fences in C++20

1 Answer 1

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related