Suppose I have code like this:

import numpy as np

def value_error(x):
    if x > 10:
        return 0.
    else:
        return np.sin(x)

If it is called on a NumPy array, this raises ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().
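For reference, a minimal reproduction of that error. The function is repeated so the snippet is self-contained; the variable names `scalar_result` and `raised` are only illustrative.

```python
import numpy as np

def value_error(x):
    if x > 10:
        return 0.
    else:
        return np.sin(x)

# A scalar works: the comparison yields a single bool.
scalar_result = value_error(11.0)

# An array with more than one element does not: `x > 10` is a boolean
# array, and `if` cannot reduce it to a single truth value.
try:
    value_error(np.array([1.0, 11.0]))
    raised = False
except ValueError:
    raised = True
```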

Now I could do this instead:

def alright(x):
    return np.sin(x) * (x <= 10.)

print(alright(np.ones(100) * 100))
print(value_error(np.ones(100) * 10))  # raises ValueError

My function (np.sin in this case) could be an expensive one. In alright, however, it is evaluated for every element of x, including those where x > 10 and I already know the answer without the expensive call. How can I get the best of both worlds?

3 Answers

4

Many ufuncs take a where parameter:

In [98]: x=np.arange(10)*2
In [99]: mask = x<10
In [100]: y = np.zeros(10)
In [101]: np.sin(x,where=mask,out=y)
Out[101]: 
array([ 0.        ,  0.90929743, -0.7568025 , -0.2794155 ,  0.98935825,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ])

While this is a small case, timeit suggests that it has little advantage over the masked assignment in @Divakar's answer:

In [104]: timeit np.sin(x,where=mask,out=y)
5.17 µs ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [105]: timeit y[mask] = np.sin(x[mask])
4.69 µs ± 9.54 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

(For much larger x, the where parameter has a slight time advantage over masked assignment.)
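A minimal sketch of the same idea wrapped in a function, using the question's condition (x <= 10) rather than the x < 10 of the session above. The names `sin_where` and `limit` are hypothetical. The important detail is that `out` is pre-filled: positions where the mask is False are simply left untouched by the ufunc, so without an initialised `out` they would hold arbitrary values.

```python
import numpy as np

def sin_where(x, limit=10.0):
    """Return sin(x) where x <= limit and 0 elsewhere (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    mask = x <= limit
    # Pre-fill with the known answer for the masked-out positions.
    out = np.zeros_like(x)
    # The ufunc writes only where mask is True; other entries keep their 0.
    np.sin(x, where=mask, out=out)
    return out

x = np.array([0.5, 2.0, 10.0, 11.0, 100.0])
y = sin_where(x)
```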

2

Here's a mask-based approach that calls np.sin only on the valid elements -

out = np.zeros(x.shape)
mask = x <= 10
out[mask] = np.sin(x[mask])
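To check that the costly function really sees only the selected elements, here is a small sketch. The `expensive` function is a hypothetical stand-in (it just wraps np.sin and records how many elements it was given), and `masked_apply` and `limit` are illustrative names, not part of NumPy.

```python
import numpy as np

calls = []  # number of elements passed to each call of `expensive`

def expensive(values):
    """Hypothetical stand-in for a costly vectorised function."""
    calls.append(values.size)
    return np.sin(values)

def masked_apply(x, limit=10.0):
    x = np.asarray(x, dtype=float)
    out = np.zeros(x.shape)          # known answer for x > limit
    mask = x <= limit
    out[mask] = expensive(x[mask])   # only the selected elements are computed
    return out

x = np.array([0.1, 0.2, 11.0, 0.3, 50.0])
y = masked_apply(x)
```

After running this, `calls` records a single call on three elements, although `x` has five.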

Leveraging the numexpr module for faster transcendental operations -

import numexpr as ne

out = np.zeros(x.shape)
mask = x <= 10
x_masked = x[mask]
out[mask] = ne.evaluate('sin(x_masked)')

2 Comments

Why is numexpr faster for sin? My understanding was that it only helped for compound expressions
@Eric Why is it faster with transcendental functions (sin, etc.)? No idea, but it seems to be. Not sure we can dig deep enough to find out, though.
2

Note that your function does not take advantage of numpy's vectorisation. There are a few possible options.

Option 1
This seems like a good use case for np.where -

y = np.where(x > 10, 0, np.sin(x))

This returns values chosen element-wise according to the mask provided. Here's a sample -

x
array([  0.1,   0.2,   0.3,  11. ,   0.1,  11. ])

np.where(x > 10, 0, np.sin(x))
array([ 0.09983342,  0.19866933,  0.29552021,  0.        ,  0.09983342,  0.        ])

Note that this method still calls the "expensive function" for each element.
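The reason is that Python evaluates the arguments of np.where before the call, so the full array is computed first and np.where only selects from it afterwards. A small sketch, with a hypothetical `expensive` wrapper that records how many elements it receives:

```python
import numpy as np

seen = []  # number of elements passed to each call of `expensive`

def expensive(values):
    """Hypothetical stand-in for a costly vectorised function."""
    seen.append(values.size)
    return np.sin(values)

x = np.array([0.1, 0.2, 0.3, 11.0, 0.1, 11.0])

# expensive(x) runs on all six elements before np.where picks values.
y = np.where(x > 10, 0, expensive(x))
```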


Option 2
Another possibility is to use a mask and set values conditionally -

y = np.sin(x)
y[x > 10] = 0

Similarly, you could multiply x by a mask and call np.sin on the result (using x <= 10 so that x == 10 is handled as in the original function) -

y = np.sin(x * (x <= 10))
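One caveat worth illustrating: masking the argument gives 0 for the excluded elements only because sin(0) is 0. For a function whose value at 0 is not the desired fill value, the mask has to be applied to the result instead. A short sketch, using np.cos purely as a counterexample (it is not part of the question):

```python
import numpy as np

x = np.array([0.5, 10.0, 11.0])
mask = x <= 10

sin_trick = np.sin(x * mask)   # excluded entries become sin(0) == 0
cos_trick = np.cos(x * mask)   # excluded entries become cos(0) == 1, not 0
cos_safe = np.cos(x) * mask    # masking the result gives 0 as intended
```

Either way, the function is still evaluated for every element, so neither form avoids the expensive calls.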

As Divakar mentioned, you can use numexpr with this condition -

import numexpr as ne
y = ne.evaluate("sin(x * (x <= 10))")

This should be faster than the pure NumPy versions above, although it still evaluates sin for every element.

5 Comments

So fast! Thanks!
For transcendental ones, numexpr could be helpful.
@Divakar I've not worked with this before, but it looks cool! How would you get it to work with a condition? ne.evaluate("sin(x) if x < 10 else 0") seems to not be right.
I guess it would work better with the mask one, as suggested at the end of the post.
These methods still call the 'expensive' function on all values of x.
