I'd like to run several reduce functions over a single iterable in Python (2.7). An example would be calling min and max on an iterable of integers. But of course you can't call reduce(min, it) and then reduce(max, it) when it is an iterator, since it's exhausted after the first call. So you might think to do something like:
reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), ((x, x) for x in it))
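Concretely, on a small list the pairing trick does both folds in one pass (reduce is imported from functools here so the snippet runs on either Python version):

```python
from functools import reduce  # built-in in Python 2, lives in functools in Python 3

# Pair each element with itself; the fold keeps min in slot 0 and max in slot 1.
pairs = ((x, x) for x in [3, 1, 4, 1, 5, 9, 2, 6])
lo, hi = reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), pairs)
print(lo, hi)  # 1 9
```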
And you think, Hey, that's pretty nifty, so you generalize it into something like this:
    from itertools import izip

    def multireduce(iterable, *funcs):
        """:Return: The tuple resulting from calling ``reduce(func, iterable)`` for each `func` in `funcs`."""
        return reduce(
            lambda a, b: tuple(func(aa, bb) for func, aa, bb in izip(funcs, a, b)),
            ((item,) * len(funcs) for item in iterable))
(And you like unit tests, so you include something like this:)
    import unittest

    class TestMultireduce(unittest.TestCase):
        def test_multireduce(self):
            vecs = (
                ((1,), (min,), (1,)),
                (xrange(10), (min, max), (0, 9)),
                (xrange(101), (min, max, lambda x, y: x + y), (0, 100, (100 * 101) // 2)),
            )
            for iterable, funcs, expected in vecs:
                self.assertSequenceEqual(tuple(multireduce(iterable, *funcs)), expected)
But then you try it, and you realize it's really slow:
    %timeit reduce(min, xrange(1000000)) ; reduce(max, xrange(1000000))
    10 loops, best of 3: 140 ms per loop

    %timeit reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), ((x, x) for x in xrange(1000000)))
    1 loop, best of 3: 682 ms per loop

    %timeit multireduce(xrange(1000000), min, max)
    1 loop, best of 3: 1.99 s per loop
Ouch. So then you come to Stack Overflow looking for Python optimization wisdom...
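For comparison, the most obvious fallback I can think of is a hand-rolled loop that keeps one accumulator per function (a sketch, not yet timed; the name multireduce_loop is my own):

```python
def multireduce_loop(iterable, *funcs):
    """Fold each func over the iterable in a single explicit loop,
    keeping one accumulator per function instead of building tuples."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        raise TypeError('multireduce_loop() of empty iterable')
    accs = [first] * len(funcs)  # every fold starts from the first element
    for item in it:
        for i, func in enumerate(funcs):
            accs[i] = func(accs[i], item)
    return tuple(accs)

print(multireduce_loop(range(10), min, max))  # (0, 9)
```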