3

I'm dealing with a big dataset and want to basically this:

test = np.random.rand(int(1e7))-0.5
def test0(test):
    return [0 if c<0 else c for c in test]

which is doing this:

def test1(test):
    for i,dat in enumerate(test):
        if dat<0: 
            test[i] = 0
        else:
            test[i] = dat
    return test

Is there a way to modify test0 to skip the else request so i works like this:

def test1(test):
    for i,dat in enumerate(test):
        if dat<0: test[i] = 0
    return test

Thanks in advance!

2
  • Or stackoverflow.com/questions/10335090/… Commented Dec 22, 2021 at 10:29
  • 1
    "which is doing this:" – not quite, test0 returns a new list, test1 modifies the input array. Commented Dec 22, 2021 at 10:30

3 Answers 3

2

just do which seems to be fastest option for you:

(1) test[test < 0] = 0

(2) np.where(test < 0, 0, test) # THANKS TO @antony-hatchkins

(3) test.clip(0) # THANKS TO @u12-forward

depending on how you test it.

when you execute each method 1000 times then approach number 2 is fastest. when you measure single function execution then option number 1 is fastest.

test:

import numpy as np
import timeit
from copy import copy
from functools import partial


def create_data():
    return np.random.rand(int(1e7))-0.5


def func1(data):
    data[data < 0] = 0


def func2(data):
    np.putmask(data, data < 0, 0)


def func3(data):
    np.maximum(data, 0)


def func4(data):
    data.clip(0)


def func5(data):
    np.where(data < 0, 0, data)


if __name__ == '__main__':
    n_loops = 1000
    test = create_data()

    t1 = timeit.Timer(partial(func1, copy(test)))
    t2 = timeit.Timer(partial(func2, copy(test)))
    t3 = timeit.Timer(partial(func3, copy(test)))
    t4 = timeit.Timer(partial(func4, copy(test)))
    t5 = timeit.Timer(partial(func4, copy(test)))

    print(f"func1 (x[x < 0]): timeit {t1.timeit(n_loops)} num test loops {n_loops}")
    print(f"func2 (putmask): timeit {t2.timeit(n_loops)} num test loops {n_loops}")
    print(f"func3 (maximum): timeit {t3.timeit(n_loops)} num test loops {n_loops}")
    print(f"func4 (clip): timeit {t4.timeit(n_loops)} num test loops {n_loops}")
    print(f"func5 (where): timeit {t5.timeit(n_loops)} num test loops {n_loops}")

test results:

func1 (x[x < 0]): timeit 7.2177265440000005 num test loops 1000
func2 (putmask): timeit 13.913492435999999 num test loops 1000
func3 (maximum): timeit 23.065230873999997 num test loops 1000
func4 (clip): timeit 22.768682354000006 num test loops 1000
func5 (where): timeit 23.844607757999995 num test loops 1000

EDIT:

different approach to test data[data < 0] = 0 vs np.where(data < 0, 0, data):

import numpy as np
from time import perf_counter as clock


z = np.random.rand(10**7) - 0.5

start = clock()
for i in range(100):
    a = z.copy()
    np.where(a<0, 0, a)
print(clock() - start)


start = clock()
for i in range(100):
    a = z.copy()
    a[a<0] = 0
print(clock() - start)

test result:

7.9247566030000005
8.021165436000002

test3:

In [1]: import numpy as np
   ...: from copy import copy
   ...:
   ...:
   ...:
   ...: test = np.random.rand(int(1e7))-0.5
   ...:
   ...:
   ...: def func1():
   ...:     data = copy(test)
   ...:     data[data < 0] = 0
   ...:
   ...:
   ...: def func2():
   ...:     data = copy(test)
   ...:     np.putmask(data, data < 0, 0)
   ...:
   ...:
   ...: def func3():
   ...:     data = copy(test)
   ...:     np.maximum(data, 0)
   ...:
   ...:
   ...: def func4():
   ...:     data = copy(test)
   ...:     data.clip(0)
   ...:
   ...:
   ...: def func5():
   ...:     data = copy(test)
   ...:     np.where(data < 0, 0, data)
   ...:

In [2]: timeit func1
16.9 ns ± 0.117 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [3]: timeit func2
15.8 ns ± 0.184 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [4]: timeit func3
22.1 ns ± 0.287 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [5]: timeit func4
15.6 ns ± 0.0594 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [6]: timeit func5
16.2 ns ± 0.187 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)
Sign up to request clarification or add additional context in comments.

16 Comments

Wow that is very fast, even faster than np.where(test<0,0,test). Thanks!
It''s actually a bit slower (19 vs 7.41 ms) for me.
I've added the clip to the comparison
Your benchmark is not correct because the operation is applied on the first invocation of the timeit function and after that you operate on an already non-negative array which is obviously faster and not the thing you're benchmarking against.
@AntonyHatchkins Good point actually! 64.5 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 100 loops each) 54.3 ms ± 638 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 68.7 ms ± 665 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 68.5 ms ± 374 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 64.6 ms ± 1.57 ms per loop (mean ± std. dev. of 7 runs, 100 loops each) These were my results (just copied the code above) ordered by func1 ... func5 Indeed on my hardware the putmask method is the fastest. I've learned a lot, thank you for all answers!
|
2

Use np.ndarray.clip like test.clip(min=0):

>>> test.clip(0)
array([0.        , 0.11819274, 0.36379089, ..., 0.        , 0.13401746,
       0.        ])
>>> 

Documentation of np.ndarray.clip:

Return an array whose values are limited to [min, max]. One of max or min must be given.

2 Comments

According to my benchmark clip is the slowest. Which is surprising because this is its primary use case and it looks as if it is a bug in numpy that it is so slow.
I agree, clip is not fast on larger arrays.
2

You could try

np.maximum(test, 0)

But where is the fastest on my machine:

enter image description here

https://gist.github.com/axil/af6c4adb8c5634ff39ed9f3da1efaa90

Actually it depends on the amount of negative values in the array:

enter image description here

https://gist.github.com/axil/ce4ecdf1cb0446db47b979c37ed5fba3

Results:
    – where is the fastest in most cases and is the only one with the flat curve
    – putmask is #2
    – where is only faster than the others when there's almost nothing to be done (≤10%)
    – maximum and clip are (surprisingly) slower than the others in the whole range and obviously share the implementation.

The size of the array generally does not matter: enter image description here https://gist.github.com/axil/2241e62977f46753caac7005268d5b28

9 Comments

where is not fastest as it is still kind of if else but vectorized
@DariuszKrynicki Run this gist on your computer and see it yourself ;)
your test is different. in each loop you execute each function once while I test performance of the function on many executions each.
this is interesting. where is faster on small number of executions but slower on higher number of executions.
@DariuszKrynicki It can't be the case. Python is not numba. Every line of code has a predefined number of bytecode operations. No matter how many of them you execute you'll have the same speed. Modern CPUs have their own quirks and optimizations with branch predictions, but I'd argue we haven't gone that low-level in this task. If you insist on this point I can plot time vs number of repetitions but I don't expect any surprises there. You can try it yourself if you wish.
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.