
Suppose I have a vector<vector<int>> L with N vectors, where the total number of ints across all vectors is at most M. What is the tightest time complexity of the standard C++ sort, sort(L.begin(), L.end())?

The vector<int> comparison function has runtime at most O(M), so an obvious bound is O(NM log N). But comparing two vectors of lengths A and B takes only O(min(A, B)) time, so if we implement a standard mergesort, each of the O(log N) levels performs at most O(M) integer comparisons in total, and the runtime is O((N + M) log N).

Does the C++ standard guarantee that the runtime is O((N+M) log N)?
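For concreteness, a minimal sketch of the setup I mean (the sizes and values are arbitrary):

#include <algorithm>
#include <vector>

int main() {
    // N inner vectors, with at most M ints in total across all of them.
    std::vector<std::vector<int>> L = {
        {1, 2, 3},
        {1, 2},
        {4},
        {1, 2, 3, 0}
    };
    // Uses std::vector's operator<, i.e. lexicographical comparison.
    std::sort(L.begin(), L.end());
}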

  • FYI: If you really care about complexity, consider using Radix Sort. Counting sorts have improved complexity over compare sorts. Commented Aug 6, 2021 at 13:24

2 Answers


There isn't enough information. You also need to know how the M ints are distributed across the N vectors. Once you have that, it's straightforward to find the overall complexity:

  1. std::sort has a complexity of O(N·log(N)) comparisons.

  2. std::vector uses std::lexicographical_compare(v1, v2) for comparison, which has a complexity of O(min(v1.size(), v2.size())) comparisons.

  3. int comparison has a complexity of O(1).

  4. We'll let E(M, N) be a function of M and N that returns the mean of min(v1.size(), v2.size()) over every pair of inner vectors.

    • For example, if you have a uniform distribution, this is trivially equal to M/N.
  5. Take the product: Big Oh = N·log(N)·E(M, N)·1.
    • For a uniform distribution, this would be M·log(N).

You can use Discrete Probability Distribution theory to figure out what the E(M, N) function is for any distribution of M across N.
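If you want to sanity-check E(M, N) for a concrete distribution, one option is to count the element comparisons directly. A rough sketch (the counting comparator and its names are just illustrative, not part of any standard API):

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

// Counts int comparisons performed while std::sort sorts the outer vector.
static std::size_t g_int_comparisons = 0;

bool counting_less(const std::vector<int>& a, const std::vector<int>& b) {
    // Mirrors lexicographical comparison, but counts element comparisons.
    std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        ++g_int_comparisons;
        if (a[i] < b[i]) return true;
        if (b[i] < a[i]) return false;
    }
    return a.size() < b.size();
}

int main() {
    std::vector<std::vector<int>> L = {{1, 2, 3}, {1, 2}, {4}, {1, 2, 3, 0}};
    std::sort(L.begin(), L.end(), counting_less);
    std::cout << "int comparisons: " << g_int_comparisons << '\n';
    // Dividing by the number of vector-to-vector comparisons (which you could
    // count the same way) gives an empirical estimate of E(M, N).
}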


Edit 1: To drive home how/why this matters, consider a distribution that always makes my vectors look like:

outer[0].size() == 1,
outer[1].size() == 1,
outer[2].size() == 1,
...,
outer[N-1].size() == (M - N + 1)

In this case, E(M, N) = 1, because for any pair of inner vectors at least one of them has size 1, so the lexicographical comparison stops after a single element comparison. Thus, for this particular distribution, I will always have a complexity of O(N·log(N)). But with a uniform distribution, I'll have O(M·log(N)).
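A quick way to build this kind of input, if you want to feed it to a counting comparator like the sketch above (make_skewed is just a hypothetical helper name):

#include <vector>

// Builds N inner vectors holding M ints in total: N - 1 singletons and
// one long vector of size M - N + 1 (assumes M >= N >= 1).
std::vector<std::vector<int>> make_skewed(int N, int M) {
    std::vector<std::vector<int>> outer(N);
    for (int i = 0; i + 1 < N; ++i)
        outer[i] = {i};                                  // size 1
    outer[N - 1] = std::vector<int>(M - N + 1, 0);       // size M - N + 1
    return outer;
}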


Edit 2: Following the comment where you define your distribution, let's try to find E(M, N).

First, notice that there are in total T = (N choose 2) = N(N − 1)/2 different combinations of vector comparisons.

One (and only one) combination will take X = (M − N + 2)/2 comparisons, and it has probability P(X) = 1/T of occurring.

Every other combination requires just 1 comparison (O(1)), and together those cases occur with probability P(1) = (T − 1)/T.

Finding the mean is simple: X·P(X) + 1·P(1).

Given this, WolframAlpha says: E(M, N) = (M + (N - 2) N)/((N - 1) N).

Multiplying that function by N log(N) gives us (M + (N - 2) N) log(N) / (N - 1), which can be further simplified to the Big Oh you're looking for: O((M/N + N) log(N)).
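As a quick sanity check of that formula, take N = 4 and M = 10: there are T = 6 pairs, one pair costs (M − N + 2)/2 = 4 comparisons and the other five cost 1 each, so the mean is (4 + 5)/6 = 1.5, which agrees with (M + N(N − 2))/(N(N − 1)) = (10 + 8)/12 = 1.5.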


8 Comments

Why do we use the mean comparison time for every pair of inner vectors? Shouldn't it be the mean comparison time for every pair that is compared by the C++ sort algorithm?
Thanks, I get that. I'm just wondering about this case: N-2 vectors of length 1, 2 vectors of length (M-N+2)/2. Clearly this should take very little time too. However the comparison time can be up to (M-N+2)/2. Does that mean the C++ sort will take (M-N+2)/2 * N log N time? I think we need to know what exactly are the comparisons done by the sort algorithm...
Okay well, again, this takes a bit of probability theory to understand. I'll try my best to explain. Please update your question w/ an edit giving the distribution while I type out the answer. :)
@Wakaka Hope that makes sense!
Well, Landau notation can often be "prettified" intentionally so the last sentence does not make much sense. O((M+(N-2)N)/(N-1)*log(N)) "is" O((N + M/N) log(N)) and that in turn "is" O((M+N) log(N)) so the OP's guess was correct, just less specific.

In case your integers are more or less random 1), most comparisons only need to look at the first few integers of each vector (until the first mismatch), so in practice / on average

M (counterintuitively) doesn't have any effect on the algorithmic complexity

To give you some idea: even if your vectors have infinite length and the most frequently occurring integer value has probability p = 50%, you need fewer than 2 comparisons on average:

k < ∑ p^i = 1/(1-p)    (sum over i ≥ 0)  | p = 0.5
k < ∑ 0.5^i = 2

For other probabilities the results are:

60% -> k <  2.5
70% -> k <  3.4
80% -> k <  5.0
90% -> k < 10.0

Keep in mind that all those numbers are upper bounds on the average number of integer comparisons, and they are independent of the number of elements in the vectors.
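If you want to convince yourself empirically, a small simulation along these lines (names and parameters are just illustrative) estimates the average number of element comparisons needed before a mismatch for uniformly random digits 0–9:

#include <cstddef>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(12345);
    std::uniform_int_distribution<int> digit(0, 9);

    const int trials = 1000000;
    std::size_t total_compares = 0;
    for (int t = 0; t < trials; ++t) {
        // Compare two "infinite" random sequences element by element and
        // count how many positions are examined before the first mismatch.
        std::size_t k = 0;
        while (true) {
            ++k;
            if (digit(rng) != digit(rng)) break;
        }
        total_compares += k;
    }
    // For uniform digits 0-9 the per-position match probability is 0.1,
    // so the average should come out near 1 / (1 - 0.1) ≈ 1.11,
    // regardless of how long the vectors are.
    std::cout << "average comparisons per vector comparison: "
              << static_cast<double>(total_compares) / trials << '\n';
}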

1) By random I don't mean random in a cryptographic sense. The numbers don't even have to pass most quality tests for random numbers. The only requirement is that the vectors don't systematically share a common prefix that grows with their length.
Except for malicious input I can't currently think of a realistic example that would not qualify as "more or less random", but there is probably something else.

5 Comments

M matters here for the same reason N matters in find: some comparisons can end early, yes, but average case is still O(N/2) = O(N). When M >> N, M can become significant.
@Brian: No! Consider N == 2, M == 1000 (so two vectors of size 500) and let the integers be between 0 and 10. The probability that the last two integers are significant is 0.1^500. If you let M grow, the probability that the newly added integer makes a difference falls exponentially, while the length only grows linearly.
That's true, but only because of the additional constraint you've placed on the integers. OP hasn't placed such a constraint, and when we remove it, M is significant even in that example.
@Brian: That is why I said "In case your Integers are more or less random". If you have to harden your algorithm against an attacker, that can choose arbitrary numbers (namely e.g. only 1) this of course doesn't help. But for most other applications you get a much more realistic estimate of how the algorithms will behave if you assume that the comparison terminates early. It is a bit like the situation with a hash table: Your worst case performance is O(n) and in some situations this is important but usually, you just assume that lookup is O(1)
Could you please be more descriptive in your answer on what it means for integers to be "more or less random", then?
