I'd like to run several reduce functions over a single iterable in Python (2.7). An example would be calling min and max on an iterable of integers. But of course you can't call reduce(min, it) and then reduce(max, it) when it is an iterator, since it's exhausted after the first call. So you might think to do something like:
reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), ((x, x) for x in it))
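Concretely, on a small list the pairing trick does both folds in one pass (reduce is imported from functools here so the snippet runs on either Python version):

```python
from functools import reduce  # built-in in Python 2, lives in functools in Python 3

# Pair each element with itself; the fold keeps min in slot 0 and max in slot 1.
pairs = ((x, x) for x in [3, 1, 4, 1, 5, 9, 2, 6])
lo, hi = reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), pairs)
print(lo, hi)  # 1 9
```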
And you think, Hey, that's pretty nifty, so you generalize it into something like this:
    from itertools import izip

    def multireduce(iterable, *funcs):
        """:Return: The tuple resulting from calling ``reduce(func, iterable)`` for each `func` in `funcs`."""
        return reduce(
            lambda a, b: tuple(func(aa, bb) for func, aa, bb in izip(funcs, a, b)),
            ((item,) * len(funcs) for item in iterable))
(And you like unit tests, so you include something like this:)
    import unittest

    class TestMultireduce(unittest.TestCase):
        def test_multireduce(self):
            vecs = (
                ((1,), (min,), (1,)),
                (xrange(10), (min, max), (0, 9)),
                (xrange(101), (min, max, lambda x, y: x + y), (0, 100, (100 * 101) // 2)),
            )
            for iterable, funcs, expected in vecs:
                self.assertSequenceEqual(tuple(multireduce(iterable, *funcs)), expected)
But then you try it, and you realize it's really slow:
    %timeit reduce(min, xrange(1000000)) ; reduce(max, xrange(1000000))
    10 loops, best of 3: 140 ms per loop

    %timeit reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), ((x, x) for x in xrange(1000000)))
    1 loop, best of 3: 682 ms per loop

    %timeit multireduce(xrange(1000000), min, max)
    1 loop, best of 3: 1.99 s per loop
Ouch. So then you come to Stack Overflow looking for Python optimization wisdom...
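For comparison, the most obvious fallback I can think of is a hand-rolled loop that keeps one accumulator per function (a sketch, not yet timed; the name multireduce_loop is my own):

```python
def multireduce_loop(iterable, *funcs):
    """Fold each func over the iterable in a single explicit loop,
    keeping one accumulator per function instead of building tuples."""
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        raise TypeError('multireduce_loop() of empty iterable')
    accs = [first] * len(funcs)  # every fold starts from the first element
    for item in it:
        for i, func in enumerate(funcs):
            accs[i] = func(accs[i], item)
    return tuple(accs)

print(multireduce_loop(range(10), min, max))  # (0, 9)
```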