18 votes
2 answers
4k views

I have read many articles on memory ordering, and all of them only say that a CPU reorders loads and stores. Does a CPU (I'm specifically interested in an x86 CPU) only reorder loads and stores, and ...
James's user avatar
  • 803
30 votes
3 answers
5k views

Reading Joseph Albahari's threading tutorial, the following are mentioned as generators of memory barriers: C#'s lock statement (Monitor.Enter/Monitor.Exit); all methods on the Interlocked class ...
Ohad Schneider's user avatar
18 votes
2 answers
4k views

This question is a follow-up/clarification to this: Does the MOV x86 instruction implement a C++11 memory_order_release atomic store? This states the MOV assembly instruction is sufficient to ...
user997112's user avatar
  • 31.1k
39 votes
2 answers
26k views

8.1.2 Bus Locking: Intel 64 and IA-32 processors provide a LOCK# signal that is asserted automatically during certain critical memory operations to lock the system bus or equivalent link. While this ...
Gilgamesz's user avatar
  • 5,173
24 votes
4 answers
20k views

I read the "Intel Optimization Guide for Intel Architecture". However, I still have no idea when I should use _mm_sfence(), _mm_lfence(), and _mm_mfence(). Could anyone explain when these ...
prgbenz's user avatar
  • 1,199
20 votes
3 answers
5k views

As we know from a previous answer to "Does it make any sense instruction LFENCE in processors x86/x86_64?", we cannot use SFENCE instead of MFENCE for Sequential Consistency. An answer there ...
Alex's user avatar
  • 13.3k
5 votes
2 answers
4k views

As far as I know, a function call acts as a compiler barrier, but not as a CPU barrier. This tutorial says the following: acquiring a lock implies acquire semantics, while releasing a lock ...
user8426277's user avatar
58 votes
6 answers
25k views

In "C# 4 in a Nutshell", the author shows that this class can sometimes write 0 without MemoryBarrier, though I can't reproduce it on my Core2Duo: public class Foo { int _answer; bool _complete; ...
Felipe Pessoto's user avatar
31 votes
5 answers
11k views

The Linux kernel uses lock; addl $0,0(%%esp) as a write barrier, while the RE2 library uses xchgl (%0),%0 as a write barrier. What's the difference and which is better? Does x86 also require a read barrier ...
Hongli's user avatar
  • 19k
15 votes
1 answer
1k views

I have been trying to Google my question but I honestly don't know how to succinctly state the question. Suppose I have two threads in a multi-core Intel system. These threads are running on the ...
Cube Fan's user avatar
  • 153
61 votes
2 answers
24k views

I often find claims on the internet that LFENCE makes no sense on x86 processors, i.e. it does nothing, so instead of MFENCE we can painlessly use SFENCE, because MFENCE = SFENCE + LFENCE = SFENCE + NOP =...
Alex's user avatar
  • 13.3k
73 votes
2 answers
32k views

Some languages provide a volatile modifier that is described as performing a "read memory barrier" prior to reading the memory that backs a variable. A read memory barrier is commonly described as a ...
Jason Kresowaty's user avatar
7 votes
1 answer
2k views

I have been reading "Memory Barriers: A Hardware View for Software Hackers", a very popular article by Paul E. McKenney. One of the things the paper highlights is that very weakly ordered processors ...
KodeWarrior's user avatar
  • 3,618
32 votes
4 answers
4k views

I am currently reading C++ Concurrency in Action by Anthony Williams. One of his listings shows this code, and he states that the assertion that z != 0 can fire. #include <atomic> #include <...
Aryan's user avatar
  • 638
156 votes
5 answers
70k views

What is meant by using an explicit memory fence?
yesraaj's user avatar
  • 48.3k
29 votes
2 answers
13k views

I'm a newbie when it comes to this. Could anyone provide a simplified explanation of the differences between the following memory barriers? The Windows MemoryBarrier(); The fence _mm_mfence(); The ...
AJG85's user avatar
  • 16.3k
81 votes
1 answer
43k views

Ok, I have been reading the following Qs from SO regarding x86 CPU fences (LFENCE, SFENCE and MFENCE): Does it make any sense instruction LFENCE in processors x86/x86_64? What is the impact SFENCE and ...
user997112's user avatar
  • 31.1k
26 votes
5 answers
16k views

I'm writing a multithreaded application in C++, where performance is critical. I need to use a lot of locking while copying small structures between threads, and for this I have chosen to use spinlocks. ...
sigvardsen's user avatar
  • 1,541
2 votes
1 answer
434 views

I am writing this post in connection to "Deep understanding of volatile in Java": public class Main { private int x; private volatile int g; public void actor1(){ x = 1; g = 1;...
Gilgamesz's user avatar
  • 5,173
19 votes
2 answers
8k views

I read recently about memory barriers and the reordering issue and now I have some confusion about it. Consider the following scenario: private object _object1 = null; private object _object2 = ...
Jalal Said's user avatar
  • 16.2k
8 votes
2 answers
5k views

x86 guarantees a total order over all stores due to its TSO memory model. My question is if anyone has an idea how this is actually implemented. I have a good impression how all the 4 fences are ...
pveentjer's user avatar
  • 11.6k
53 votes
7 answers
13k views

A coworker and I write software for a variety of platforms running on x86, x64, Itanium, PowerPC, and other 10 year old server CPUs. We just had a discussion about whether mutex functions such as ...
David's user avatar
  • 1,033
30 votes
2 answers
7k views

I tried looking for details on this, and I even read the standard on mutexes and atomics... but I still couldn't understand the C++11 memory model visibility guarantees. From what I understand the very ...
NoSenseEtAl's user avatar
  • 30.9k
13 votes
3 answers
3k views

So I researched the topic for quite some time now, and I think I understand the most important concepts like the release and acquire memory fences. However, I haven't found a satisfactory explanation ...
domin's user avatar
  • 1,384
12 votes
2 answers
544 views

Using a simplified version of a basic seqlock, gcc reorders a nonatomic load up across an atomic load(memory_order_seq_cst) when compiling the code with -O3. This reordering isn't observed when ...
Alejandro's user avatar
  • 3,082
