
This statement is running quite slowly, and I have run out of ideas to optimize it. Could someone help me out?

[dict(zip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list]

The small_lists contain only about 6 elements.

A really_huge_list_of_list of size 209,510 took approximately 16.5 seconds to finish executing.

Thank you!

Edit:

really_huge_list_of_list is a generator. Apologies for any confusion. The size is obtained from the result list.
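For reference, a rough reproduction of the setup described above might look like this (the data is synthetic; small_list1 and make_rows are stand-ins for wherever the real data comes from):

import timeit

small_list1 = ["a", "b", "c", "d", "e", "f"]

def make_rows(n=209510):
    # Generator yielding n rows of 6 values each, mimicking really_huge_list_of_list.
    return (list(range(6)) for _ in range(n))

def build_dicts():
    really_huge_list_of_list = make_rows()
    return [dict(zip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list]

print(timeit.timeit(build_dicts, number=1))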

  • Using a generator is the best option, because large lists consume a large amount of memory. Commented Jul 1, 2013 at 11:03
  • I would strongly suggest having a generator for your dictionary as well as for really_huge_list_of_list, so that at any one time, when actually using the dictionary, you have a single dictionary of size 6 rather than a list of 209,510 six-entry dictionaries. I think that is what K DawG was suggesting. Commented Jul 1, 2013 at 11:25
  • Just to give some context, the list will be converted to JSON and passed to the client. Commented Jul 1, 2013 at 11:32

2 Answers


Possible minor improvement:

import itertools

[dict(itertools.izip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list]

Also, you may consider using a generator instead of a list comprehension.
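A sketch of both suggestions combined, assuming Python 2 (itertools.izip was removed in Python 3, where the built-in zip is already lazy):

import itertools

# Generator expression: nothing is materialized until the caller iterates over it.
dict_stream = (dict(itertools.izip(small_list1, small_list2))
               for small_list2 in really_huge_list_of_list)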


1 Comment

Thanks for your reply. However, since small_list2 is a small list, izip does not provide much of an advantage.

To expand on what the comments are trying to say, you should use a generator instead of that list comprehension. Your code currently looks like this:

[dict(zip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list]

and you should change it to this instead:

def my_generator(input_list_of_lists):
    small_list1 = ["wherever", "small_list1", "comes", "from"]
    for small_list2 in input_list_of_lists:
        yield dict(zip(small_list1, small_list2))

What you're doing right now is taking ALL the results of iterating over your really huge list, and building up a huge list of the results, before doing whatever you do with that list of results. Instead, you should turn that list comprehension into a generator so that you never have to build up a list of 200,000 results. It's building that result list that's taking up so much memory and time.

... Or better yet, just turn that list comprehension into a generator comprehension by changing its outer brackets into parentheses:

(dict(zip(small_list1, small_list2)) for small_list2 in really_huge_list_of_list)

That's really all you need to do. The syntax for list comprehensions and generator comprehensions is almost identical, on purpose: if you understand a list comprehension, you'll understand the corresponding generator comprehension. (In this case, I wrote out the generator in "long form" first so that you'd see what that comprehension expands to).

For more on generator comprehensions, see here, here and/or here.

Hope this helps you add another useful tool to your Python toolbox!

3 Comments

Thanks @rmunn for your detailed explanation. I will definitely try it out and report the performance. Correct me if I am wrong, though: if I have to return the generator as JSON, won't I have to create the large list from the generator anyway?
Hi @rmunn and the rest, the execution time dropped to ~4e-6 s, which is a huge improvement. However, I still need to pass it into JSON, and the step of converting the generator to a list (as expected) takes ~15-16 s. Is there another way to convert the generator object into something that's JSON serializable?
@benedictljj: gist.github.com/akaihola/1415730 looks like it might be exactly what you need. Note that it's based on the version of simplejson that's included with Django, and you haven't said that you're using Django. But if Django's version of simplejson is an unmodified copy of pypi.python.org/pypi/simplejson, then you should have no trouble using the code in that gist. (Disclaimer: I don't know anything about Django internals anymore. Used to, a little, but that was years ago.)
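If that gist or Django's bundled simplejson is not available, a minimal stand-alone sketch of the same idea using only the standard-library json module might look like this (write_json_array is a hypothetical helper name; it streams one small dict at a time instead of materializing the full 200,000-element list):

import json

def write_json_array(stream, dict_iterable):
    # Write an iterable of dicts to `stream` as a JSON array, one element
    # at a time, without ever building the full list in memory.
    stream.write("[")
    for i, item in enumerate(dict_iterable):
        if i:
            stream.write(",")
        stream.write(json.dumps(item))
    stream.write("]")

# Usage with the generator from the answer above, e.g.:
# with open("out.json", "w") as f:
#     write_json_array(f, my_generator(really_huge_list_of_list))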
