Here I have two functions:
from functools import reduce
from operator import mul

def fn1(arr):
    product = 1
    for number in arr:
        product *= number
    return [product // x for x in arr]

def fn2(arr):
    product = reduce(mul, arr)
    return [product // x for x in arr]
Benchmark:
In [2]: arr = list(range(1,11))
In [3]: %timeit fn1(arr)
1.62 µs ± 23.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [4]: %timeit fn2(arr)
1.88 µs ± 28.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: arr = list(range(1,101))
In [6]: %timeit fn1(arr)
38.5 µs ± 190 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [7]: %timeit fn2(arr)
41 µs ± 463 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [8]: arr = list(range(1,1001))
In [9]: %timeit fn1(arr)
4.23 ms ± 25.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [10]: %timeit fn2(arr)
4.24 ms ± 36.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [11]: arr = list(range(1,10001))
In [12]: %timeit fn1(arr)
605 ms ± 4.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [13]: %timeit fn2(arr)
594 ms ± 4.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Here fn2() is marginally slower on the small lists. My understanding was that reduce() and mul() are both built-in functions, so they run at C speed and should be faster than the for loop. Is it that the many extra function calls inside fn2() (each of which takes some time) hurt the overall performance? But then the trend shows that fn2() outperforms fn1() on the larger lists. Why?
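One way to test the function-call hypothesis is to time the product step on its own, without the division comprehension. Here is a minimal sketch using timeit (the helper names are mine, not from the original post):

from functools import reduce
from operator import mul
from timeit import timeit

def product_loop(arr):
    # the accumulation loop from fn1, without the final comprehension
    product = 1
    for number in arr:
        product *= number
    return product

def product_reduce(arr):
    # the reduce(mul, ...) step from fn2, without the final comprehension
    return reduce(mul, arr)

arr = list(range(1, 10001))
print(timeit(lambda: product_loop(arr), number=10))
print(timeit(lambda: product_reduce(arr), number=10))

If the two product steps differ by far less than the gap you care about, the remaining time is being spent in the list comprehension, not in the calls.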
reduce is there just in case you prefer programming in a more functional style; it's not supposed to be an optimization. Also, you have a list comprehension in the functions that is probably taking most of the time anyway, especially considering you are dividing numbers with literally thousands of digits (10,000! takes over 100,000 bits to store).

Each step of reduce() still involves a Python-level call to mul. (It seems that the standard CPython interpreter doesn't do inlining: stackoverflow.com/questions/6442050/…, note how PyPy gets a special mention for its inlining support.) You shouldn't worry too much about this; if performance on that scale matters, Python is probably the wrong tool to begin with. (Though you could try PyPy.)

If the for loop plus arithmetic ops (fn1) take less time to compute than reduce() + mul() (fn2), then that should remain true regardless of the size of the list, right? Furthermore, with more items in the list the gap between fn1 and fn2 should widen, since more function calls would require even more time. But I'm observing the opposite: with more items in the list, fn2 magically starts performing better than fn1.
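The claim about the size of the numbers is easy to verify. A small check (a sketch, not from the thread), using only the standard library:

import math
from timeit import timeit

# The product of range(1, 10001) is 10000!, far larger than a machine word,
# so every // in the comprehension is an arbitrary-precision division.
product = math.factorial(10000)
print(product.bit_length())   # roughly 118,000 bits, i.e. over 100,000

# A single big-integer division already dwarfs the cost of a function call:
print(timeit(lambda: product // 5000, number=1000))

At that size the arbitrary-precision arithmetic dominates the runtime, so the per-element call overhead that separates fn1 and fn2 becomes noise, and small differences in either direction are within measurement variance.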