Faster for loop with only if in python [duplicate]

Question

I'm dealing with a big dataset and want to basically this:

test = np.random.rand(int(1e7))-0.5
def test0(test):
    return [0 if c<0 else c for c in test]

which is doing this:

def test1(test):
    for i,dat in enumerate(test):
        if dat<0: 
            test[i] = 0
        else:
            test[i] = dat
    return test

Is there a way to modify test0 to skip the else request so i works like this:

def test1(test):
    for i,dat in enumerate(test):
        if dat<0: test[i] = 0
    return test

Thanks in advance!

"which is doing this:" – not quite, test0 returns a new list, test1 modifies the input array. — mkrieger1
– mkrieger1, Commented Dec 22, 2021 at 10:30

Dariusz Krynicki · Accepted Answer · 2021-12-22 22:57:42Z

2

just do which seems to be fastest option for you:

(1) test[test < 0] = 0

(2) np.where(test < 0, 0, test) # THANKS TO @antony-hatchkins

(3) test.clip(0) # THANKS TO @u12-forward

depending on how you test it.

when you execute each method 1000 times then approach number 2 is fastest. when you measure single function execution then option number 1 is fastest.

test:

import numpy as np
import timeit
from copy import copy
from functools import partial


def create_data():
    return np.random.rand(int(1e7))-0.5


def func1(data):
    data[data < 0] = 0


def func2(data):
    np.putmask(data, data < 0, 0)


def func3(data):
    np.maximum(data, 0)


def func4(data):
    data.clip(0)


def func5(data):
    np.where(data < 0, 0, data)


if __name__ == '__main__':
    n_loops = 1000
    test = create_data()

    t1 = timeit.Timer(partial(func1, copy(test)))
    t2 = timeit.Timer(partial(func2, copy(test)))
    t3 = timeit.Timer(partial(func3, copy(test)))
    t4 = timeit.Timer(partial(func4, copy(test)))
    t5 = timeit.Timer(partial(func4, copy(test)))

    print(f"func1 (x[x < 0]): timeit {t1.timeit(n_loops)} num test loops {n_loops}")
    print(f"func2 (putmask): timeit {t2.timeit(n_loops)} num test loops {n_loops}")
    print(f"func3 (maximum): timeit {t3.timeit(n_loops)} num test loops {n_loops}")
    print(f"func4 (clip): timeit {t4.timeit(n_loops)} num test loops {n_loops}")
    print(f"func5 (where): timeit {t5.timeit(n_loops)} num test loops {n_loops}")

test results:

func1 (x[x < 0]): timeit 7.2177265440000005 num test loops 1000
func2 (putmask): timeit 13.913492435999999 num test loops 1000
func3 (maximum): timeit 23.065230873999997 num test loops 1000
func4 (clip): timeit 22.768682354000006 num test loops 1000
func5 (where): timeit 23.844607757999995 num test loops 1000

EDIT:

different approach to test data[data < 0] = 0 vs np.where(data < 0, 0, data):

import numpy as np
from time import perf_counter as clock


z = np.random.rand(10**7) - 0.5

start = clock()
for i in range(100):
    a = z.copy()
    np.where(a<0, 0, a)
print(clock() - start)


start = clock()
for i in range(100):
    a = z.copy()
    a[a<0] = 0
print(clock() - start)

test result:

7.9247566030000005
8.021165436000002

test3:

In [1]: import numpy as np
   ...: from copy import copy
   ...:
   ...:
   ...:
   ...: test = np.random.rand(int(1e7))-0.5
   ...:
   ...:
   ...: def func1():
   ...:     data = copy(test)
   ...:     data[data < 0] = 0
   ...:
   ...:
   ...: def func2():
   ...:     data = copy(test)
   ...:     np.putmask(data, data < 0, 0)
   ...:
   ...:
   ...: def func3():
   ...:     data = copy(test)
   ...:     np.maximum(data, 0)
   ...:
   ...:
   ...: def func4():
   ...:     data = copy(test)
   ...:     data.clip(0)
   ...:
   ...:
   ...: def func5():
   ...:     data = copy(test)
   ...:     np.where(data < 0, 0, data)
   ...:

In [2]: timeit func1
16.9 ns ± 0.117 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [3]: timeit func2
15.8 ns ± 0.184 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [4]: timeit func3
22.1 ns ± 0.287 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [5]: timeit func4
15.6 ns ± 0.0594 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [6]: timeit func5
16.2 ns ± 0.187 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

edited Dec 22, 2021 at 22:57

answered Dec 22, 2021 at 10:34

Dariusz Krynicki

2,7283 gold badges30 silver badges55 bronze badges

Sign up to request clarification or add additional context in comments.

16 Comments

Thesmot Over a year ago

Wow that is very fast, even faster than np.where(test<0,0,test). Thanks!

Thesmot Over a year ago

It''s actually a bit slower (19 vs 7.41 ms) for me.

Antony Hatchkins Over a year ago

I've added the clip to the comparison

Antony Hatchkins Over a year ago

Your benchmark is not correct because the operation is applied on the first invocation of the timeit function and after that you operate on an already non-negative array which is obviously faster and not the thing you're benchmarking against.

Thesmot Over a year ago

@AntonyHatchkins Good point actually! 64.5 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 100 loops each) 54.3 ms ± 638 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 68.7 ms ± 665 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 68.5 ms ± 374 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 64.6 ms ± 1.57 ms per loop (mean ± std. dev. of 7 runs, 100 loops each) These were my results (just copied the code above) ordered by func1 ... func5 Indeed on my hardware the putmask method is the fastest. I've learned a lot, thank you for all answers!

|

U13-Forward · Accepted Answer · 2021-12-22 10:40:45Z

2

Use np.ndarray.clip like test.clip(min=0):

>>> test.clip(0)
array([0.        , 0.11819274, 0.36379089, ..., 0.        , 0.13401746,
       0.        ])
>>>

Documentation of np.ndarray.clip:

Return an array whose values are limited to [min, max]. One of max or min must be given.

answered Dec 22, 2021 at 10:40

U13-Forward

71.8k15 gold badges100 silver badges125 bronze badges

2 Comments

Antony Hatchkins Over a year ago

According to my benchmark clip is the slowest. Which is surprising because this is its primary use case and it looks as if it is a bug in numpy that it is so slow.

Dariusz Krynicki Over a year ago

I agree, clip is not fast on larger arrays.

Antony Hatchkins · Accepted Answer · 2021-12-23 07:22:01Z

2

You could try

np.maximum(test, 0)

But where is the fastest on my machine:

https://gist.github.com/axil/af6c4adb8c5634ff39ed9f3da1efaa90

Actually it depends on the amount of negative values in the array:

https://gist.github.com/axil/ce4ecdf1cb0446db47b979c37ed5fba3

Results:
– where is the fastest in most cases and is the only one with the flat curve
– putmask is #2
– where is only faster than the others when there's almost nothing to be done (≤10%)
– maximum and clip are (surprisingly) slower than the others in the whole range and obviously share the implementation.

The size of the array generally does not matter: https://gist.github.com/axil/2241e62977f46753caac7005268d5b28

edited Dec 23, 2021 at 7:22

answered Dec 22, 2021 at 10:28

Antony Hatchkins

34.4k13 gold badges121 silver badges114 bronze badges

9 Comments

Dariusz Krynicki Over a year ago

where is not fastest as it is still kind of if else but vectorized

Antony Hatchkins Over a year ago

@DariuszKrynicki Run this gist on your computer and see it yourself ;)

Dariusz Krynicki Over a year ago

your test is different. in each loop you execute each function once while I test performance of the function on many executions each.

Dariusz Krynicki Over a year ago

this is interesting. where is faster on small number of executions but slower on higher number of executions.

Antony Hatchkins Over a year ago

@DariuszKrynicki It can't be the case. Python is not numba. Every line of code has a predefined number of bytecode operations. No matter how many of them you execute you'll have the same speed. Modern CPUs have their own quirks and optimizations with branch predictions, but I'd argue we haven't gone that low-level in this task. If you insist on this point I can plot time vs number of repetitions but I don't expect any surprises there. You can try it yourself if you wish.

|

Collectives™ on Stack Overflow

Faster for loop with only if in python [duplicate]

3 Answers 3

16 Comments

2 Comments

9 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

16 Comments

2 Comments

9 Comments

Linked

Related