Trying to calculate algorithm time complexity

Question

So last night I solved this LeetCode question. My solution is not great, quite slow. So I'm trying to calculate the complexity of my algorithm to compare with standard algorithms that LeetCode lists in the Solution section. Here's my solution:

class Solution:
    def longestCommonPrefix(self, strs: List[str]) -> str:
        # Get lengths of all strings in the list and get the minimum
        # since common prefix can't be longer than the shortest string.
        # Catch ValueError if list is empty
        try:
            min_len = min(len(i) for i in strs)
        except ValueError:
            return ''

        # split strings into sets character-wise
        foo = [set(list(zip(*strs))[k]) for k in range(min_len)]

        # Now go through the resulting list and check whether resulting sets have length of 1
        # If true then add those characters to the prefix list. Break as soon as encounter
        # a set of length > 1.
        prefix = []
        for i in foo:
            if len(i) == 1:
                x, = i
                prefix.append(x)
            else:
                break
        common_prefix = ''.join(prefix)
        return common_prefix

I'm struggling a bit with calculating complexity. First step - getting minimum length of strings - takes O(n) where n is number of strings in the list. Then the last step is also easy - it should take O(m) where m is the length of the shortest string.

But the middle bit is confusing. set(list(zip(*strs))) should hopefully take O(m) again and then we do it n times so O(mn). But then overall complexity is O(mn + m + n) which seems way too low for how slow the solution is.

The other option is that the middle step is O(m^2*n), which makes a bit more sense. What is the proper way to calculate complexity here?

sidenote: I found that leetcode reported runtimes are likely affected by external factors. So don't stress if your results look bad because it could just be a slow server. — Razzle Shazl
– Razzle Shazl, Commented Feb 25, 2021 at 3:42
Another thing is that scanning across the 𝑘th character of many strings will have poor cache locality; that'll make the program run slower even though the complexity may be the same. — Jiří Baum
– Jiří Baum, Commented Feb 25, 2021 at 4:16

Razzle Shazl · Accepted Answer · 2021-02-25 05:02:54Z

2

Yes, the middle portion is O{mn}, as well the overall is O{mn} because that dwarfs the O{m} and O{n} terms for large values of m and n.

Your solution has an ideal order of runtime complexity.

Optimize: Short-Circuit

However, you are probably dismayed that others have faster solutions. I suspect that others likely short-circuit on the first non-matching index.

Let's consider a test case of 26 strings (['a'*500, 'b'*500, 'c'*500, ...]). Your solution would proceed to create a list that is 500 long, with each entry containing a set of 26 elements. Meanwhile, if you short-circuited, you would only process the first index, ie one set of 26 characters.

Try changing your list into a generator. This might be all you need to short-circuit.

foo = (set(x) for x in zip(*strs)))

You can skip min_len check because default behaviour of zip is to iterate only as long as the shortest input.

Optimize: Generating Intermediate Results

I see that you append each letter to a list, then ''.join(lst). This is efficient, especially compared to the alternative of iteratively appending to a string.

However, we could just as easily save a counter match_len. Then when we detect the first mis-match, just:

return strs[0][:match_len]

edited Feb 25, 2021 at 5:02

answered Feb 25, 2021 at 3:32

Razzle Shazl

1,3501 gold badge10 silver badges21 bronze badges

Sign up to request clarification or add additional context in comments.

11 Comments

Jiří Baum Over a year ago

Creating a set from a list comprehension will still evaluate the whole set.

Jiří Baum Over a year ago

Ah, I see what you mean; foo = (set(list(zip(*strs))[k]) for k in range(min_len)) would indeed construct it incrementally.

Jiří Baum Over a year ago

The code foo = (set(zip(*strs))) is equivalent to foo = set(zip(*strs)) and immediately constructs the full set.

Razzle Shazl Over a year ago

Thanks @sabik. I didn't realize that you needed a for within the outer-most parentheses to actually turn the outer-most parentheses into a generator. Have updated accordingly: foo = (set(x) for x in zip(*strs)))

NotAName Over a year ago

@RazzleShazl Brilliant! With this small change the solution is suddenly faster than 94.1% of all other solutions and the runtime has dropped from around 60ms to 28ms!

|

Collectives™ on Stack Overflow

Trying to calculate algorithm time complexity

1 Answer 1

Optimize: Short-Circuit

Optimize: Generating Intermediate Results

11 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Optimize: Short-Circuit

Optimize: Generating Intermediate Results

11 Comments

Your Answer

Sign up or log in

Post as a guest

Related