Newest 'memory-barriers' Questions

4 votes

0 answers

176 views

Are writes (atomic or not) in a third thread visible due to a "happens before" relationship between two other threads?

As the title states, I have a question about the visibility guarantees for memory operations under the happens-before order. I've read the cppreference memory order page and a previous ISO C++ draft, but I still cannot ...

Haiyang He

41

asked Oct 31 at 3:36

1 vote

1 answer

189 views

Is sequentially consistent memory ordering strictly necessary in this readers-writers lock using only load/store, not RMW?

Consider this outline of a simple multi-threaded application. It has one writer thread, and ten reader threads. #include <atomic> #include <thread> const int Num_readers{...

WaltK

802

asked Oct 1 at 0:00

1 vote

0 answers

88 views

std::thread::join call reordering [duplicate]

I have a thread pool implementation, which stops all threads by setting a special atomic bool stop to true. Its destructor looks like this: ThreadPool::~ThreadPool() { stop.store(true); for (...

Artegful

11

asked Sep 2 at 6:23
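
The shutdown pattern described in the thread-pool question above can be sketched as follows. This is a minimal illustration, not the asker's code: `MiniPool`, `run_and_stop`, and the `exited` counter are hypothetical names, and the workers do nothing except poll the flag. It assumes the default (seq_cst) ordering on the stop flag, as in the excerpt's `stop.store(true)`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the ThreadPool in the question.
class MiniPool {
public:
    MiniPool(unsigned n, std::atomic<unsigned>& exited) {
        for (unsigned i = 0; i < n; ++i) {
            workers_.emplace_back([this, &exited] {
                // Poll until the destructor raises the stop flag.
                while (!stop_.load()) {
                    std::this_thread::yield();
                }
                exited.fetch_add(1, std::memory_order_relaxed);
            });
        }
    }

    ~MiniPool() {
        stop_.store(true);  // default ordering is memory_order_seq_cst
        for (auto& t : workers_) {
            t.join();  // each thread's completion synchronizes with its join()
        }
    }

private:
    std::atomic<bool> stop_{false};     // declared first so it exists before any worker runs
    std::vector<std::thread> workers_;
};

// Builds a pool of n workers, destroys it, and returns how many workers exited.
unsigned run_and_stop(unsigned n) {
    std::atomic<unsigned> exited{0};
    {
        MiniPool pool(n, exited);
    }  // destructor sets the flag and joins every worker
    unsigned count = exited.load(std::memory_order_relaxed);
    assert(count == n);
    return count;
}
```

Calling `run_and_stop(4)` should return 4: every worker has finished before the destructor returns, because `join()` only returns after its thread has completed.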

2 votes

0 answers

116 views

Is it possible on any real hardware, for the updated value of an atomic integer to become visible earlier via an indirect path than via a direct path?

Is it possible on any real hardware in the real world, for the updated value of an atomic integer written by one thread to become visible to another thread earlier via an indirect path, where a third ...

Qwert Yuiop

362

asked Aug 19 at 21:10

1 vote

1 answer

58 views

How to simulate a weakly ordered memory environment on a host with strong memory ordering?

I want to test whether my program has bugs where memory ordering has not been used properly, but I do not have a weakly ordered memory environment for testing. For example, on x86, all loads ...

untitled

563

asked Aug 17 at 14:14

2 votes

1 answer

216 views

Can the hardware reorder an atomic load followed by an atomic store, if the store is conditional on the load?

Can the hardware reorder an atomic load followed by an atomic store, if the store is conditional on the load? It would be highly unintuitive if this could happen, because if thread1 speculatively due ...

Qwert Yuiop

362

asked Aug 15 at 20:54

0 votes

2 answers

218 views

Why can't an acquire barrier stop reordering around a branch?

I was testing the behavior of the control dependencies in LINUX KERNEL MEMORY BARRIERS, and had a problem with the location of the fence. I was testing this on AArch64 on a Qualcomm Snapdragon 835, ...

Kymdon

13

asked Aug 7 at 5:22

7 votes

0 answers

268 views

C++ memory order relaxed for a SeqLock producer and consumer algorithm

Below is a screenshot from https://www.youtube.com/watch?v=5uIsadq-nyk; at 1:12:23, the line std::size_t seq2=seq.load(std::memory_order_relaxed) baffles me. I am not very familiar with atomics, ...

Michael

810

asked Jul 6 at 10:27

0 votes

1 answer

80 views

virtio: using the VIRTIO_F_EVENT_IDX feature may lose notifications

I want to check if host notifications might be lost in the following scenarios. The hypothesis is shown in the figure below. However, I am not sure whether the commit actions of the last two CPUs ...

wang fuqiang

81

asked Jun 6 at 12:07

3 votes

0 answers

163 views

In the Independent Read Independent Write (IRIW) scenario, is changing loads to seq_cst alone sufficient to prevent the result in C++23?

In C++23, consider the classic IRIW litmus test, with the modification that all loads are now seq_cst, while stores are still relaxed: void reader0(atomic_int *x, atomic_int *y) { int l0x = x->...

Liu Xiaoyi

31

asked May 14 at 18:51

2 votes

1 answer

168 views

What Store/Store reordering do modern CPUs do in practice?

Aarch64 and RISC-V WMO seem to allow Store/Store reordering according to their formal specifications. However, Store/Store reordering seems very tricky to perform in practice: the CPU would need to ...

64_

579

asked Apr 27 at 3:05

1 vote

0 answers

104 views

RISC-V instruction equivalent to ARM's DSB execution barrier instruction for benchmarking to time loads?

I am writing a RISC-V assembly program whose goal is to assess the performance of main memory, in read access only for now. I have thought about a simple benchmark code, that would load multiple ...

SFV

11

asked Apr 9 at 14:50

4 votes

1 answer

103 views

GCC wiki memory barrier example

The following code comes from the GCC Wiki. // -Thread 1- y.store (20, memory_order_relaxed) x.store (10, memory_order_relaxed) // -Thread 2- if (x.load (memory_order_relaxed) == 10) { assert (y....

hk134579

43

asked Apr 2 at 3:03
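
For contrast with the relaxed stores and loads in the GCC Wiki excerpt above, here is a minimal sketch of the same two-thread shape using a release store and an acquire load on `x`. The helper name `run_message_passing` and the polling loop are assumptions made for illustration; they are not part of the GCC Wiki example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the value of y that the reader observes
// after it has seen x == 10.
int run_message_passing() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int seen = -1;

    std::thread writer([&] {
        y.store(20, std::memory_order_relaxed);
        x.store(10, std::memory_order_release);  // publishes the earlier store to y
    });

    std::thread reader([&] {
        // Wait until the acquire load reads the value written by the release store.
        while (x.load(std::memory_order_acquire) != 10) {
            std::this_thread::yield();
        }
        seen = y.load(std::memory_order_relaxed);
        assert(seen == 20);  // holds: the store to y happens-before this load
    });

    writer.join();
    reader.join();
    return seen;
}
```

With both operations on `x` relaxed, as in the excerpt, nothing orders the store to `y` before the reader's load of `y`, so the corresponding assert would not be guaranteed to hold.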

4 votes

1 answer

163 views

Does C++ `memory_order_seq_cst` guarantee load of other variables in the current thread?

I'm struggling to understand the standardese regarding memory_order_seq_cst guarantees and I'm not able to find an appropriate example. The code below demonstrates my situation. #include <atomic>...

Jordan Woyak

65

asked Mar 29 at 22:51

4 votes

1 answer

93 views

The sequentially consistent order of C++11 vs traditional GCC built-ins like `__sync_synchronize`

So I've come across Jeff Preshing's wonderful blog posts on what Acquire/Release are and how they may be achieved with some CPU barriers. I've also read that SeqCst is about some total order that's ...

Not A Name

63

asked Mar 23 at 17:23

6 votes

2 answers

266 views

Does a syscall automatically imply a memory barrier/read values sequentially consistent (specifically futex)?

In C++, I have two threads. Each thread does a store first on one variable, then a load on another variable, but in reversed order: std::atomic<bool> please_wake_me_up{false}; uint32_t cnt{0}; ...

sedor

326

asked Mar 9 at 17:48
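
The "store to one variable, then load the other, in reversed order" shape in the question above is commonly called the store-buffering litmus test. The sketch below is a simplified illustration that assumes both variables are `std::atomic<int>` accessed with `memory_order_seq_cst`; it does not reproduce the asker's futex code, and the names `both_loads_saw_zero` and `check_store_buffering` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs one store-buffering trial and reports whether
// both threads read the initial value 0.
bool both_loads_saw_zero() {
    std::atomic<int> a{0};
    std::atomic<int> b{0};
    int r1 = -1;
    int r2 = -1;

    std::thread t1([&] {
        a.store(1, std::memory_order_seq_cst);
        r1 = b.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        b.store(1, std::memory_order_seq_cst);
        r2 = a.load(std::memory_order_seq_cst);
    });

    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the trial; with seq_cst on all four operations, the outcome
// r1 == 0 && r2 == 0 is forbidden by the single total order.
void check_store_buffering(int trials) {
    for (int i = 0; i < trials; ++i) {
        assert(!both_loads_saw_zero());
    }
}
```

If the four operations were weakened to release stores and acquire loads, both threads reading 0 would become a permitted outcome, which is why this shape is the usual motivation for seq_cst or a full fence between the store and the load.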

4 votes

1 answer

119 views

Visibility of atomic operations with seq-cst fences in C++20

Until C++17 the standard contained the following paragraph (C++17 Section 32.4 [atomics.order] paragraph 6): For atomic operations A and B on an atomic object M, where A modifies M and B takes its ...

mpoeter

3,041

asked Mar 3 at 14:41

2 votes

1 answer

149 views

Array based Lock-Free stack. Is full fence necessary?

I'm new to lock-free algorithms and trying to implement a stack, which is the simplest lock-free data structure. Here is my implementation of a bounded array-based lock-free stack. public class ...

Some Name

9,730

asked Feb 15 at 16:57

0 votes

1 answer

114 views

Which memory barriers do I need, to make the writes to image in thread A visible in Thread B?

Where do I need to put memory barriers? The writes to image in thread A should be visible in thread B. The spots are marked in the pseudo code example and are derived from this question/answer. There ...

knivil

857

asked Jan 29 at 16:09

1 vote

0 answers

132 views

What is Memory Ordering Nuke in Intel CPUs?

I found this term in https://rcs.uwaterloo.ca/~ali/cs854-f23/papers/topdown.pdf For example, incorrect data speculation generated Memory Ordering Nukes [7] - a subset of Machine Clears. What is it ...

k1r1t0

867

asked Jan 14 at 12:27

2 votes

1 answer

91 views

Are acquire-release semantics transitive across threads? [duplicate]

I recently encountered two seemingly opposing explanations on the transitivity of acquire-release semantics. The section "Transitive Synchronization with Acquire-Release Ordering" on pg 160 ...

ron burgundy

193

asked Dec 31, 2024 at 9:36

1 vote

0 answers

42 views

Why Did LOCK-prefixed Instructions Become Preferred Over MFENCE for Memory Fences in the JVM on x86? [duplicate]

Context This question explores the semantics of the MFENCE full memory barrier instruction in the x86 instruction set, particularly in comparison with modifying instructions that include the LOCK ...

Dmytro Kostenko

245

asked Dec 26, 2024 at 17:15

1 vote

0 answers

143 views

How Does the Store Buffer Drain in x86 Architecture Work?

The topic of the Store Buffer (SB) and its mechanics, size, purpose, and interaction with other buffers has been discussed on Stack Overflow several times. However, certain aspects of its operation ...

Dmytro Kostenko

245

asked Dec 25, 2024 at 15:21

1 vote

2 answers

191 views

C++ memory order on Apple M1 chip does not work: reordering happens even with seq_cst in a StoreStore / LoadLoad litmus test?

With the seq_cst memory order, the following code should never have v1 == 0 and v2 == 2 . But it still just prints Reorder happened on my Apple M1 chip. I really don't know why. #include <semaphore....

Ryaaan

25

asked Dec 12, 2024 at 2:12

-1 votes

1 answer

81 views

Why can different threads see different memory operation orders? [duplicate]

The following code is an example from the book C++ Concurrency in Action (2nd edition). The author mentions that threads Ta and Tb can observe different memory states: Tc observes x == true and y == ...

hao

1

asked Dec 9, 2024 at 6:34
