Looking in cppreference, it seems to imply a std::binary_semaphore may be more efficient than a std::mutex.
Is there any reason not to use a std::binary_semaphore initialized to 1 instead of a std::mutex?
There are differences between a std::binary_semaphore and a std::mutex which are mentioned in the cppreference documentation (under the Notes section):
Unlike std::mutex, a counting_semaphore is not tied to threads of execution - acquiring a semaphore can occur on a different thread than releasing the semaphore, for example. All operations on counting_semaphore can be performed concurrently and without any relation to specific threads of execution, with the exception of the destructor, which cannot be performed concurrently but can be performed on a different thread.
Semaphores are also often used for the semantics of signaling/notifying rather than mutual exclusion, by initializing the semaphore with 0 and thus blocking the receiver(s) that try to acquire(), until the notifier "signals" by invoking release(n). In this respect semaphores can be considered alternatives to std::condition_variables, often with better performance.
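For illustration, here is a minimal sketch (C++20) of the signalling pattern the quoted note describes: the semaphore starts at 0, the receiver blocks in acquire(), and a worker thread signals with release(). The names ready and result are just illustrative.

    #include <iostream>
    #include <semaphore>
    #include <thread>

    // The semaphore starts at 0, so the receiver blocks until the worker signals.
    std::binary_semaphore ready{0};
    int result = 0;

    int main() {
        std::thread worker([] {
            result = 42;      // produce some state
            ready.release();  // "signal" - may happen on a different thread than acquire()
        });
        ready.acquire();      // blocks until the worker has released
        std::cout << result << '\n';
        worker.join();
    }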
So I would say there are some reasons to prefer std::mutex.
In addition, as @MatthieuM. commented below, mutex implementations might be able to offer a performance advantage over a semaphore, e.g. the futex-based implementation on Linux.
In summary:
These are two separate programming constructs.
They do have some functional overlap, but that is quite common (e.g. you can do with a struct everything you can do with a class, yet they have some differences and are used in different contexts).
This old post is from some years before C++20 was introduced (and std::binary_semaphore added), but it contains some additional relevant information.
A side note:
As @interjay commented above, cppreference does not compare the efficiency of std::binary_semaphore vs. std::mutex, but rather says it may be more efficient than std::counting_semaphore:
Implementations may implement binary_semaphore more efficiently than the default implementation of std::counting_semaphore.
There is no reason to assume a std::binary_semaphore will be more efficient for implementing mutual exclusion than a std::mutex. As has been pointed out in the comments, cppreference merely hints at a potential performance difference between the counted and binary semaphore.
In general, mutex and semaphore target different use cases: A semaphore is for signalling, a mutex is for mutual exclusion. Mutual exclusion means you want to make sure that multiple threads cannot execute certain critical sections of code at the same time. std::mutex is the only synchronization facility in the standard library for this use case. The semaphore on the other hand targets the use case where one thread causes the program state to change and now wants to inform another thread of that change. There are multiple similar facilities in the standard library that target signalling use cases, for example std::condition_variable.
This is also the reason why semaphores do not need to be acquired and released on the same thread, unlike mutexes, where unlocking a mutex that is held by another thread results in undefined behavior.
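As a small sketch of that difference (C++20), a binary_semaphore can legitimately be acquired on one thread and released on another, whereas the equivalent hand-off with std::mutex would be undefined behaviour:

    #include <semaphore>
    #include <thread>

    std::binary_semaphore token{1};

    int main() {
        token.acquire();                         // "taken" on the main thread
        std::thread t([] { token.release(); });  // released from another thread: well-defined
        t.join();

        // The analogous pattern with std::mutex is undefined behaviour,
        // because only the owning thread may unlock it:
        //   std::mutex m;
        //   m.lock();
        //   std::thread([&] { m.unlock(); }).join();  // UB
    }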
You should avoid using a signalling primitive for implementing mutual exclusion. Even though this can in theory be done, it is very easy to do it subtly wrong and likely to be less efficient than a dedicated mutual exclusion algorithm.
One big advantage that's not mentioned in the other answers is priority inheritance.
If a higher-priority thread (for instance an interactive thread rather than a background one) tries to lock a mutex while a lower-priority thread holds it, the lower-priority thread is temporarily upgraded so it can get its job done quickly and release the lock to let the high-priority thread run.
Since a semaphore can be "released" by any thread, there's no clear thread to increase the priority of and therefore priority inheritance does not work with them. So using a mutex instead of a semaphore is better for responsiveness too.
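std::mutex itself does not expose priority inheritance; whether the underlying implementation uses it is platform-specific. On POSIX systems you can ask for it explicitly by dropping down to pthreads. A sketch, assuming the platform supports PTHREAD_PRIO_INHERIT (e.g. Linux, compiled with -pthread):

    #include <pthread.h>

    int main() {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);

        pthread_mutex_t m;
        pthread_mutex_init(&m, &attr);

        pthread_mutex_lock(&m);    // a low-priority holder is boosted while a
        pthread_mutex_unlock(&m);  // higher-priority thread is blocked on this mutex

        pthread_mutex_destroy(&m);
        pthread_mutexattr_destroy(&attr);
        return 0;
    }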
(A commenter asked whether there are any std::mutex implementations for macOS that make use of priority inheritance. As far as I know, this was only ever used for Swift, but I could be wrong here. All of the implementations I am aware of came to the conclusion that optimizing fast locking by far outweighs the benefits of having a slower, priority-inversion-aware lock on non-real-time systems. In addition to that, some people argue that priority inversion is a design bug and should not be addressed by the implementation. This is similar to the argument against using recursive mutexes to avoid self-locking deadlocks.)

For example, the Linux kernel documentation for generic mutexes says:
Mutexes are sleeping locks which behave similarly to binary semaphores, and were introduced in 2006 as an alternative to these. This new data structure provided a number of advantages, including simpler interfaces, and at that time smaller code (see Disadvantages).
To wit:
Disadvantages
Unlike its original design and purpose, struct mutex is among the largest locks in the kernel. E.g.: on x86-64 it is 32 bytes, where struct semaphore is 24 bytes and rw_semaphore is 40 bytes. Larger structure sizes mean more CPU cache and memory footprint.
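Out of curiosity, you can do the same kind of footprint check for the corresponding C++ standard types on your own implementation. A quick sketch (the sizes are implementation-defined and by themselves say nothing about speed):

    #include <iostream>
    #include <mutex>
    #include <semaphore>

    int main() {
        std::cout << "std::mutex:              " << sizeof(std::mutex) << " bytes\n";
        std::cout << "std::binary_semaphore:   " << sizeof(std::binary_semaphore) << " bytes\n";
        std::cout << "std::counting_semaphore: " << sizeof(std::counting_semaphore<>) << " bytes\n";
    }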
Note that the kernel's locks are not the std:: classes, and the kernel documentation is not the cppreference documentation (despite the influence of Linux APIs on the C++ standard). On Linux, std::mutex is typically a thin wrapper around a pthread_mutex_t, and the implementation of that uses a Linux kernel futex.

There are a couple of hints in the cppreference documentation that std::binary_semaphore might be more efficient than std::mutex:
binary_semaphore is defined as "a lightweight synchronization primitive", whereas mutex is defined as "a synchronization primitive".
"semaphores can be considered alternatives to std::condition_variable, often with better performance."
Neither is definitive, and just because a semaphore is faster than a condition variable doesn't mean it's faster than a mutex used without a condition variable. So why is cppreference insinuating a difference?
In short, because it cannot know, but the author had some general expectations. There is AFAIK nothing in the C++ standard to require that a mutex must be less efficient than a semaphore, so the docs cannot rightly assert that a semaphore is more efficient than a mutex. If you need to know, test it.
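"Test it" can be as simple as a throwaway micro-benchmark. A sketch (C++20; the helper contend() and all the numbers are purely illustrative, and the results only mean something for the implementation and machine you run it on):

    #include <chrono>
    #include <iostream>
    #include <mutex>
    #include <semaphore>
    #include <thread>
    #include <vector>

    // Times several threads hammering a tiny critical section guarded by the
    // given acquire/release callables.
    template <class Acquire, class Release>
    double contend(Acquire acquire, Release release) {
        constexpr int threads = 4;
        constexpr int iters = 100'000;
        long long counter = 0;  // protected by the primitive under test
        auto start = std::chrono::steady_clock::now();
        {
            std::vector<std::jthread> pool;
            for (int t = 0; t < threads; ++t)
                pool.emplace_back([&] {
                    for (int i = 0; i < iters; ++i) {
                        acquire();
                        ++counter;
                        release();
                    }
                });
        }  // jthreads join here
        return std::chrono::duration<double, std::milli>(
                   std::chrono::steady_clock::now() - start).count();
    }

    int main() {
        std::mutex m;
        std::binary_semaphore s{1};
        std::cout << "std::mutex:            "
                  << contend([&] { m.lock(); }, [&] { m.unlock(); }) << " ms\n";
        std::cout << "std::binary_semaphore: "
                  << contend([&] { s.acquire(); }, [&] { s.release(); }) << " ms\n";
    }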
I'm not sure how well the author's expectations hold up, but you'd need to look into it for a specific implementation:
If mutex and binary_semaphore are both implemented using the Linux futex system call on the implementation you're using, there won't be much (if anything) to choose between them.
In the old days I used a system (albeit not with a C++ compiler) where the primitive called mutex had all the bells and whistles (priority inheritance, waking the highest-priority waiter), whereas the semaphore had no priority inheritance and, for the sake of argument, let's say it was FIFO (although I don't actually remember). In that case the semaphore is more performant in terms of the cycles needed to complete the corresponding operations, but for many uses the mutex would give better overall application performance because it runs the right threads at the right times.
You can implement a binary semaphore in terms of a mutex and condition variable. If your implementation did this, then mutex essentially would be guaranteed no slower than binary semaphore, and the performances suggestions made by cppreference would be reversed. I believe it would be a conforming implementation, and so cppreference cannot know that you're not using such a system.
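For reference, here is a minimal sketch of such a binary semaphore built from std::mutex and std::condition_variable (a hypothetical class, not how any particular standard library actually implements std::binary_semaphore):

    #include <condition_variable>
    #include <mutex>

    class BinarySemaphore {
    public:
        explicit BinarySemaphore(int initial) : signalled_(initial != 0) {}

        void acquire() {
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return signalled_; });  // wait until released
            signalled_ = false;
        }

        void release() {
            {
                std::lock_guard<std::mutex> lk(m_);
                signalled_ = true;
            }
            cv_.notify_one();
        }

    private:
        std::mutex m_;
        std::condition_variable cv_;
        bool signalled_;
    };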
I actually disagree with those who claim that a binary semaphore is utterly baffling when used for mutual exclusion. I mean, I can't deny that they would be utterly baffled, but I think with proper naming you can work around that. Call the semaphore something_lock, and then if the method names acquire and release still prove troublesome, wrap it in a class to rename them lock and unlock, or better yet to define the serialised code sections with RAII just like std::unique_lock does.
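A sketch of that wrapper idea (the class name SemaphoreLock and the variable something_lock are just illustrative): because it provides lock() and unlock(), it satisfies BasicLockable and works with std::lock_guard and std::unique_lock.

    #include <mutex>
    #include <semaphore>

    class SemaphoreLock {
    public:
        void lock()   { sem_.acquire(); }
        void unlock() { sem_.release(); }
    private:
        std::binary_semaphore sem_{1};  // count 1 means "unlocked"
    };

    SemaphoreLock something_lock;
    int shared_value = 0;

    void update() {
        std::lock_guard<SemaphoreLock> guard(something_lock);  // RAII, as with a mutex
        ++shared_value;
    }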
Maybe this is just because I'm old, and so I remember when a semaphore with an initial count of 1 was a viable and idiomatic way to implement mutual exclusion, with different characteristics from a mutex, and you were forced to choose between them, with "disable interrupts" as a third viable and idiomatic option competing with both. I certainly accept that the name of mutex strongly suggests that it's the first place you should look for mutual exclusion. This is enough reason not to use binary_semaphore for that purpose unless:
you can measure a performance benefit, and
you are confident this performance benefit doesn't come from discarding fairness or other larger-scale properties that you need more than a few cycles saved.
Certainly you should not reason, "mutex is probably inefficient for its intended purpose"! The fear really is that the fact a mutex can be used with a condition variable might come with a performance cost even in the case you don't use it with a condition variable (a violation of "don't pay for what you don't use"). futex made that fear go away in about 2003, as far as Linux is concerned.
The cppreference documentation doesn't state that binary_semaphore is more efficient than mutex - it says it may be more efficient than a generic counting_semaphore<1>.
Differences:
Mutex: For mutual exclusion - same thread must lock and unlock
Semaphore: For signaling between threads - acquire and release can happen on different threads
Choose mutex when:
You need clear mutual exclusion semantics
You want RAII support (lock_guard, unique_lock)
You need thread ownership enforcement
Priority inheritance is important (prevents priority inversion)
Choose binary_semaphore when:
You need thread-to-thread signaling
Different threads need to acquire and release
You're implementing producer-consumer patterns
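A sketch of the producer-consumer signalling from the last point, using two binary semaphores for a one-slot hand-off (C++20; all names are illustrative):

    #include <iostream>
    #include <semaphore>
    #include <thread>

    std::binary_semaphore slot_free{1};    // 1: the slot starts out free
    std::binary_semaphore slot_filled{0};  // 0: nothing to consume yet
    int slot = 0;

    int main() {
        std::thread producer([] {
            for (int i = 1; i <= 3; ++i) {
                slot_free.acquire();    // wait until the consumer took the last item
                slot = i;
                slot_filled.release();  // signal that an item is ready
            }
        });
        std::thread consumer([] {
            for (int i = 0; i < 3; ++i) {
                slot_filled.acquire();  // wait for an item
                std::cout << slot << '\n';
                slot_free.release();    // hand the slot back to the producer
            }
        });
        producer.join();
        consumer.join();
    }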
For protecting shared data, always use mutex. For thread signaling, use semaphore. Don't substitute based on assumed performance differences - correctness is more important than potential minor optimizations.