What algorithms are used in C++11 std::sort in different STL implementations?

Question

The C++11 standard guarantees that std::sort has O(n logn) complexity in the worst case. This is different from the average-case guarantee in C++98/03, where std::sort could be implemented with Quicksort (maybe combined with insertion sort for small n), which has O(n^2) in the worst case (for some specific input, such as sorted input).

Were there any changes in std::sort implementations in different STL libraries? How is C++11's std::sort implemented in different STLs?

Not a proper answer because I haven't conducted a full survey, but basically everyone was already using Introsort anyway, and that's why the stronger requirement is in practice not a problem for implementers. And even before Introsort, nobody with any sense used Quicksort with the leftmost element as the pivot. That was a simplification for the purposes of demonstrating the algorithm, although of course it does actually work fine for random data too. So in practice the worst case of Quicksort is not sorted input. — Steve Jessop
– Steve Jessop, Commented Mar 12, 2014 at 0:02
Yeah hybrid quicksorts (like introsort) are what pretty much everyone uses. I guess you could also find merge sort hybrids like timsort. — R. Martinho Fernandes
– R. Martinho Fernandes, Commented Mar 12, 2014 at 12:11
If you are trying to evaluate sorting methods, check on the number of constructors created during the sort. If the default constructor is inefficient, it will be slow, no matter how fast the sort is. — cup
– cup, Commented Mar 15, 2014 at 7:37
@bamboon it is not a duplicate because attracts attention to the strengthening requirements to the std::sort in the C++11 standard. And really the question is about: do C++11 std::sort implementations match C++11 standard O(n logn) worst case requirement? ( In mentioned previous answer it is said about O(n^2) worst case complexity of std::sort which is now should not be possible. ) — Spock77
– Spock77, Commented Mar 16, 2014 at 0:07
@cup You meant the copy constructor (assuming there is no move constructor and no fast specialization of std::swap), I assume? I don’t see why the default constructor should be invoked at all. — Christopher Creutzig
– Christopher Creutzig, Commented Mar 17, 2014 at 7:00

Cahit Gungor · Accepted Answer · 2014-03-20 13:04:57Z

26

The question is, how can STL say std::sort worst case is O(N log(N)), even though it is in essence a QuickSort. STL's sort is IntroSort. IntroSort is in essence a QuickSort, the difference introduced change the worst case complexity.

QuickSort worst case is O(N^2)

What ever partitioning you choose, there exist a sequence that QuickSort will run on O(N^2). The partitioning you choose only decreases the probability of the worst case to occur. (Random Pivot Selection , Median-Of-Three, etc.)

EDIT: Thanks to @maxim1000 s correction. Quicksort with pivot selection algorithm Median of Medians has O(N log(N)) worst case complexity, but due to the overhead it introduces it isn't used in practice. It shows how good selection algorithm, can change the worst-case complexity through pivot selection, theoretically.

What does IntroSort do?

IntroSort limits the branching of QuickSort. This is the most important point, that limit is 2 * (log N). When limit is reached, IntroSort can use any sorting algorithm that has worst case complexity of O(N log(N)).

Branching stops when we have O(log N) subproblems. We can solve every subproblem O(n log n). (Lower case n stands for the subproblem sizes).

Sum of (n log n) is our worst case complexity, now.

For the worst case of QuickSort; assume we have an already sorted array, and we select always the first element in this array as the pivot. In every iteration we get rid of only the first element. If we went this way until the end, it would be O(N^2) obviously. With IntroSort we stop QuickSort, when we reach a depth log(N), then we use HeapSort for the remaining unsorted array.

16 -> 1  /**N**/
   \
    > 15 -> 1 /**N - 1**/
         \
          > 14 -> 1 /**N - 2**/
               \
                > 13 -> 1 /**N - log(N)**/  
                     \
                      > 12 /**(HeapSort Now) (N - log(N)) log (N - log(N))**/

Sum them up;

Until branching stops, N + (N - 1) + ... + (N - log(N)) operations done. Instead of using gauss to sum up, we can simply say N + (N - 1) + ... + (N - log(N)) < N log(N).

The HeapSort Part is (N - log(N)) log(N - log(N)) < N log(N)

Overall complexity < 2 N log(N).

Since the constants can be omitted, the worst case complexity of IntroSort is O(N log(N)).

Added Info: GCC STL implementation source code is here. Sort function is at line 5461.

Correction: *Microsoft .NET* sort Implementation is IntroSort since 2012. Related information is here.

edited Mar 20, 2014 at 13:04

answered Mar 17, 2014 at 0:21

Cahit Gungor

1,49723 silver badges30 bronze badges

Sign up to request clarification or add additional context in comments.

14 Comments

Spock77 Over a year ago

The question was a little bit different - how std::sort is implemented in various STL implementations now - did they changed or not? How can you prove that STL's sort it introsort? introsort is well described in wikipedia, though

Cahit Gungor Over a year ago

GCC STL Source Code: gcc.gnu.org/onlinedocs/gcc-4.7.2/libstdc++/api/…. Start from line 5461 for stl::sort

maxim1000 Over a year ago

"Quicksort's worst-case performance is always O(n^2)" is incorrect. There is at least one algorithm for pivot selection which deterministically gives a good pivot - en.wikipedia.org/wiki/Median_of_medians. It's rarely used in practice due to constants, but it makes order of complexity for Quicksort O(n*log n).

Cahit Gungor Over a year ago

@maxim1000 thanks for the correction. I added the related information.

Spock77 Over a year ago

What I really wanted in the answer is the analysis of the STL source code from the people familiar with different implementations.

|

TemplateRex · Accepted Answer · 2014-07-09 21:20:46Z

22

Browsing the online sources for libstdc++ and libc++, one can see that both libraries use the full gamut of the well-known sorting algorithms from an intro-sort main loop:

For std::sort, there is a helper routine for insertion_sort (an O(N^2) algorithm but with a good scaling constant to make it competitive for small sequences), plus some special casing for sub-sequences of 0, 1, 2, and 3 elements.

For std::partial_sort, both libraries use a version of heap_sort (O(N log N) in general), because that method has a nice invariant that it keeps a sorted subsequence (it typically has a larger scaling constant to make it more expensive for full sorting).

For std::nth_element, there is a helper routine for selection_sort (again an O(N^2) algorithm with a good sclaing constant to make it competitive for small sequences). For regular sorting insertion_sort usually dominates selection_sort, but for nth_element the invariant of having the smallest elements perfectly matches the behavior of selection_sort.

edited Jul 9, 2014 at 21:20

answered Mar 12, 2014 at 15:45

TemplateRex

70.9k21 gold badges176 silver badges316 bronze badges

1 Comment

Spock77 Over a year ago

LLVM update danlark.org/2022/04/20/…

Collectives™ on Stack Overflow

What algorithms are used in C++11 std::sort in different STL implementations?

2 Answers 2

QuickSort worst case is O(N^2)

What does IntroSort do?

14 Comments

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

QuickSort worst case is O(N^2)

What does IntroSort do?

14 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related