
Suppose I have a vector<vector<int>> L with N vectors, where the total number of ints across all vectors is at most M. What is the tightest time complexity of the standard C++ sort, sort(L.begin(), L.end())?

The vector<int> comparison function has runtime at most O(M), so an obvious bound is O(NM log N). But comparing two vectors of lengths A and B takes only O(min(A, B)) time, so if we implement a standard mergesort, each of the O(log N) levels performs at most O(M) integer comparisons in total, and the runtime is O((N + M) log N).

Does the C++ standard guarantee that the runtime is O((N+M) log N)?
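For concreteness, a minimal sketch of the setup I mean (the sizes and values are arbitrary):

#include <algorithm>
#include <vector>

int main() {
    // N inner vectors, with at most M ints in total across all of them.
    std::vector<std::vector<int>> L = {
        {1, 2, 3},
        {1, 2},
        {4},
        {1, 2, 3, 0}
    };
    // Uses std::vector's operator<, i.e. lexicographical comparison.
    std::sort(L.begin(), L.end());
}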

  • FYI: If you really care about complexity, consider using Radix Sort. Counting sorts have improved complexity over compare sorts. Commented Aug 6, 2021 at 13:24

2 Answers


There isn't enough information. You also need to know how the M ints are distributed across the N vectors. Once you have that, it's straightforward to find the overall complexity:

  1. std::sort has a complexity of O(N·log(N)) comparisons.

  2. std::vector uses std::lexicographical_compare(v1, v2) for comparison, which has a complexity of O(min(v1.size(), v2.size())) comparisons.

  3. int comparison has a complexity of O(1).

  4. We'll let E(M, N) be a function of M and N that returns the mean of min(v1.size(), v2.size()) over every pair of inner vectors.

    • For example, if you have a uniform distribution, this is trivially equal to M/N.
  5. Take the product: Big Oh = N·log(N)·E(M, N)·1.
    • For a uniform distribution, this would be M·log(N).

You can use Discrete Probability Distribution theory to figure out what the E(M, N) function is for any distribution of M across N.
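If you want to sanity-check E(M, N) for a concrete distribution, one option is to count the element comparisons directly. A rough sketch (the counting comparator and its names are just illustrative, not part of any standard API):

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

// Counts int comparisons performed while std::sort sorts the outer vector.
static std::size_t g_int_comparisons = 0;

bool counting_less(const std::vector<int>& a, const std::vector<int>& b) {
    // Mirrors lexicographical comparison, but counts element comparisons.
    std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        ++g_int_comparisons;
        if (a[i] < b[i]) return true;
        if (b[i] < a[i]) return false;
    }
    return a.size() < b.size();
}

int main() {
    std::vector<std::vector<int>> L = {{1, 2, 3}, {1, 2}, {4}, {1, 2, 3, 0}};
    std::sort(L.begin(), L.end(), counting_less);
    std::cout << "int comparisons: " << g_int_comparisons << '\n';
    // Dividing by the number of vector-to-vector comparisons (which you could
    // count the same way) gives an empirical estimate of E(M, N).
}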


Edit 1: To drive home how/why this matters, consider a distribution that always makes my vectors look like:

outer[0].size() == 1,
outer[1].size() == 1,
outer[2].size() == 1,
...,
outer[N-1].size() == (M - N + 1)

In this case, E(M, N) = 1, because for any pair of inner vectors at least one of them has size 1, so the lexicographical comparison stops after a single element comparison. Thus, for this particular distribution, I will always have a complexity of O(N·log(N)). But with a uniform distribution, I'll have O(M·log(N)).
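A quick way to build this kind of input, if you want to feed it to a counting comparator like the sketch above (make_skewed is just a hypothetical helper name):

#include <vector>

// Builds N inner vectors holding M ints in total: N - 1 singletons and
// one long vector of size M - N + 1 (assumes M >= N >= 1).
std::vector<std::vector<int>> make_skewed(int N, int M) {
    std::vector<std::vector<int>> outer(N);
    for (int i = 0; i + 1 < N; ++i)
        outer[i] = {i};                                  // size 1
    outer[N - 1] = std::vector<int>(M - N + 1, 0);       // size M - N + 1
    return outer;
}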


Edit 2: Following the comment where you define your distribution, let's try to find E(M, N).

First, notice that there are in total T = (N choose 2) = N(N − 1)/2 different combinations of vector comparisons.

One (and only one) combination will take X = (M − N + 2)/2 comparisons, and it has probability P(X) = 1/T of occurring.

Every other combination requires just 1 comparison (O(1)), and together those cases occur with probability P(1) = (T − 1)/T.

Finding the mean is simple: X·P(X) + 1·P(1).

Given this, WolframAlpha says: E(M, N) = (M + (N - 2) N)/((N - 1) N).

Multiplying that function by N log(N) gives us (M + (N - 2) N) log(N) / (N - 1), which can be further simplified to the Big Oh you're looking for: O((M/N + N) log(N)).
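As a quick sanity check of that formula, take N = 4 and M = 10: there are T = 6 pairs, one pair costs (M − N + 2)/2 = 4 comparisons and the other five cost 1 each, so the mean is (4 + 5)/6 = 1.5, which agrees with (M + N(N − 2))/(N(N − 1)) = (10 + 8)/12 = 1.5.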


8 Comments

Why do we use the mean comparison time for every pair of inner vectors? Shouldn't it be the mean comparison time for every pair that is compared by the C++ sort algorithm?
Thanks, I get that. I'm just wondering about this case: N-2 vectors of length 1, 2 vectors of length (M-N+2)/2. Clearly this should take very little time too. However the comparison time can be up to (M-N+2)/2. Does that mean the C++ sort will take (M-N+2)/2 * N log N time? I think we need to know what exactly are the comparisons done by the sort algorithm...
Okay well, again, this takes a bit of probability theory to understand. I'll try my best to explain. Please update your question w/ an edit giving the distribution while I type out the answer. :)
@Wakaka Hope that makes sense!
Well, Landau notation can often be "prettified" intentionally so the last sentence does not make much sense. O((M+(N-2)N)/(N-1)*log(N)) "is" O((N + M/N) log(N)) and that in turn "is" O((M+N) log(N)) so the OP's guess was correct, just less specific.

In case your integers are more or less random 1), most comparisons only need to look at the first few integers of each vector (until the first mismatch), so in practice / on average

M (counterintuitively) doesn't have any effect on the algorithmic complexity

To give you some idea: even if your vectors have infinite length and the most frequently occurring integer value has probability p = 50%, you need fewer than 2 comparisons on average:

k < ∑ p^i = 1/(1-p)    (sum over i ≥ 0)  | p = 0.5
k < ∑ 0.5^i = 2

For other probabilities the results are:

60% -> k <  2.5
70% -> k <  3.4
80% -> k <  5.0
90% -> k < 10.0

Keep in mind that all those numbers are upper bounds on the average number of integer comparisons, and they are independent of the number of elements in the vectors.
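If you want to convince yourself empirically, a small simulation along these lines (names and parameters are just illustrative) estimates the average number of element comparisons needed before a mismatch for uniformly random digits 0–9:

#include <cstddef>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(12345);
    std::uniform_int_distribution<int> digit(0, 9);

    const int trials = 1000000;
    std::size_t total_compares = 0;
    for (int t = 0; t < trials; ++t) {
        // Compare two "infinite" random sequences element by element and
        // count how many positions are examined before the first mismatch.
        std::size_t k = 0;
        while (true) {
            ++k;
            if (digit(rng) != digit(rng)) break;
        }
        total_compares += k;
    }
    // For uniform digits 0-9 the per-position match probability is 0.1,
    // so the average should come out near 1 / (1 - 0.1) ≈ 1.11,
    // regardless of how long the vectors are.
    std::cout << "average comparisons per vector comparison: "
              << static_cast<double>(total_compares) / trials << '\n';
}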

1) By random I don't mean random in a cryptographic sense. The numbers don't even have to pass most quality tests for random numbers. The only requirement is that the vectors don't systematically share a common prefix that grows with their length.
Except for malicious input I can't currently think of a realistic example that would not qualify as "more or less random", but there is probably something else.

5 Comments

M matters here for the same reason N matters in find: some comparisons can end early, yes, but average case is still O(N/2) = O(N). When M >> N, M can become significant.
@Brian: No! Consider N == 2, M == 1000 (so two vectors of size 500) and let the integers be between 0 and 10. The probability that the last two integers are significant is 0.1^500. If you let M grow, the probability that the newly added integer makes a difference falls exponentially, while the length only grows linearly.
That's true, but only because of the additional constraint you've placed on the integers. OP hasn't placed such a constraint, and when we remove it, M is significant even in that example.
@Brian: That is why I said "In case your Integers are more or less random". If you have to harden your algorithm against an attacker, that can choose arbitrary numbers (namely e.g. only 1) this of course doesn't help. But for most other applications you get a much more realistic estimate of how the algorithms will behave if you assume that the comparison terminates early. It is a bit like the situation with a hash table: Your worst case performance is O(n) and in some situations this is important but usually, you just assume that lookup is O(1)
Could you please be more descriptive in your answer on what it means for integers to be "more or less random", then?
