AFAIK, part of the reason we need the C++11 memory model (and its later revisions) is that we trade various guarantees for fast single-threaded execution, with only one main criterion: single-threaded execution must not break. How to achieve this is completely up to the CPU and compiler vendors. Specifically, one of the tricks they are allowed to play is reordering writes: from the point of view of other parts of the system, those writes may not become visible strictly in source-code order.

My question is: in general, when we're dealing with low-level code, e.g. manipulating the interrupt enable/disable bit in mstatus/sstatus in RISC-V terms, do we need some sort of barrier around these instructions, too?

Specifically, suppose an OS kernel wants to implement a generic spinlock. A classical approach to acquiring the spinlock is to loop on an RMW (while(!compare_exchange_weak_explicit(/* snip */)), etc.) until the lock is set, before which the OS should disable interrupts to avoid deadlocks. The successful compare-and-exchange is itself a store operation. So, to ensure that interrupts really are disabled before the lock is acquired, maybe we should enforce some memory ordering here. Should the lock acquisition use memory_order_acq_rel instead of the typical memory_order_acquire, since for compare_exchange_weak_explicit, memory_order_acquire on success implies the store part is effectively memory_order_relaxed? Or should there be a StoreStore memory barrier (in RISC-V terms, something like __asm__ volatile("fence w, w" : : :);)? Or does a compiler fence suffice, if RISC-V already imposes some ordering constraints on manipulations of the mstatus/sstatus CSRs? I'm not sure that's the case; if it is, I'd appreciate a pointer to the source of that information.
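To make the acquire path described above concrete, here is a minimal host-side sketch. The names SpinLock, intr_off, intr_on, and g_interrupts_enabled are hypothetical and not taken from any real kernel; the interrupt-enable bit is modeled by a plain flag so the example can run outside a kernel, and std::atomic_signal_fence stands in for the compiler-only fence being asked about. It illustrates the shape of the code in question, not a claim about which ordering is actually sufficient on RISC-V.

```cpp
#include <atomic>

// Assumption: a plain flag models the mstatus/sstatus interrupt-enable bit,
// so this sketch can run on a host machine instead of inside a kernel.
static bool g_interrupts_enabled = true;

// Hypothetical helper: "disable interrupts", then a compiler-only fence.
// atomic_signal_fence emits no hardware fence instruction.
static inline void intr_off() {
    g_interrupts_enabled = false;
    std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Hypothetical helper: compiler-only fence, then "re-enable interrupts".
static inline void intr_on() {
    std::atomic_signal_fence(std::memory_order_seq_cst);
    g_interrupts_enabled = true;
}

struct SpinLock {
    std::atomic<bool> locked{false};

    void acquire() {
        intr_off();  // must take effect before the lock is taken
        bool expected = false;
        // Acquire ordering on success, relaxed on failure: the "typical"
        // choice the question compares against memory_order_acq_rel.
        while (!locked.compare_exchange_weak(expected, true,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed)) {
            expected = false;  // a failed CAS overwrote it with the observed value
        }
    }

    void release() {
        // Release store publishes the critical section's writes.
        locked.store(false, std::memory_order_release);
        intr_on();
    }
};
```

Note that a failed compare_exchange_weak writes the observed value into expected, which is why the loop resets it before retrying.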

  • Your inline asm omits a "memory" clobber, allowing compile-time reordering with loads/stores. That makes it mostly useless as a memory barrier in high-level source code. Commented Apr 4 at 5:37
  • Oops. Thx for pointing it out. I did have trouble figuring out what the clobber list directives mean, and apparently I hadn't fully grasped this one. Commented Apr 4 at 5:44
  • RISC-V supports atomic RMWs without disabling interrupts, using LL/SC instructions. If you do disable interrupts as a way to get atomicity wrt. interrupts and context switches on this core (e.g. on a unicore system or for per-core OS variables), then disabling / re-enabling interrupts needs to be ordered wrt. the memory accesses. If you use volatile and inline asm to roll your own instead of using C++11, that means you need a "memory" clobber on the asm statement that enables or disables interrupts, but you don't need a separate fence instruction AFAIK. Commented Apr 4 at 14:19
  • godbolt.org/z/6zhrWzPTd shows a spin-loop using CAS (lr.w.aq and sc.w), and using exchange (single-instruction amoswap.w.aq). Commented Apr 4 at 14:24
  • You definitely need a compiler fence. If the compiler reorders an important memory access from one side of intr_on/intr_off to the other, you are dead. The machine architecture can't help you with that. Commented Apr 5 at 18:14
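
The point made in the comments about the "memory" clobber can be sketched as below. This is an illustration under assumptions, not a verified kernel implementation: intr_off, intr_on, bump_ticks, per_core_ticks, and g_sie_model are hypothetical names, the RISC-V branch assumes supervisor mode and GCC/Clang inline-asm syntax, and the fallback branch only models the enable bit so the sketch can be compiled and run on a non-RISC-V host.

```cpp
#include <cstdint>

#if defined(__riscv)
// sstatus.SIE is bit 1, hence the immediate 0x2. The "memory" clobber tells
// the compiler not to move loads/stores across the asm statement; no separate
// fence instruction is emitted here.
static inline void intr_off() { __asm__ volatile("csrci sstatus, 0x2" ::: "memory"); }
static inline void intr_on()  { __asm__ volatile("csrsi sstatus, 0x2" ::: "memory"); }
#else
// Assumption: host fallback that models the enable bit with a plain flag.
// The empty asm with a "memory" clobber acts as a compiler-only barrier.
static bool g_sie_model = true;
static inline void intr_off() { g_sie_model = false; __asm__ volatile("" ::: "memory"); }
static inline void intr_on()  { __asm__ volatile("" ::: "memory"); g_sie_model = true; }
#endif

// Hypothetical per-core variable that an interrupt handler might also touch.
static std::uint64_t per_core_ticks = 0;

// The increment has to stay between intr_off() and intr_on(); without the
// "memory" clobber the compiler would be free to move it outside.
static inline std::uint64_t bump_ticks() {
    intr_off();
    std::uint64_t value = ++per_core_ticks;
    intr_on();
    return value;
}
```

The clobber only constrains the compiler. Whether the hardware needs anything beyond it for CSR writes is exactly what the question asks, and the comments answer it only with an "AFAIK".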
