This question is related to the following post: Replacing Numpy elements if condition is met.

Suppose I have two one-dimensional NumPy arrays, a and b, with 50 elements each.

I would like to create an array c of 50 elements, each of which takes a value from 0 to 3 depending on which condition is met:

if a > 0 the value in the corresponding row of c should be 0
if a < 0 the value in the corresponding row of c should be 1
if a > 0 and b < 0 the value in the corresponding row of c should be 2
if b > 0 the value in the corresponding row of c should be 3

I suppose the broader question here is how I can assign specific values to an array when there are multiple conditions. I have tried variations from the post I referenced above, but I have not been successful.

Any ideas on how I could achieve this, preferably without using a for-loop?

  • Some of your conditions contradict. For example, if a > 0 then 0, and if a > 0 and b < 0 then 2. Is there priority? Commented Mar 6, 2018 at 16:27
  • The two conditions do not contradict: when b < 0, the combination a > 0 and b < 0 is a separate condition in its own right. Commented Mar 6, 2018 at 17:18

3 Answers

A straightforward solution would be to apply the assignments in sequence.

In [18]: a = np.random.choice([-1,1],size=(10,))
In [19]: b = np.random.choice([-1,1],size=(10,))
In [20]: a
Out[20]: array([-1,  1, -1, -1,  1, -1, -1,  1,  1, -1])
In [21]: b
Out[21]: array([-1,  1,  1,  1, -1,  1, -1,  1,  1,  1])

Start off with an array with the 'default' value:

In [22]: c = np.zeros_like(a)

Apply the second condition:

In [23]: c[a<0] = 1

The third requires a little care since it combines two tests. The parentheses matter here, because & binds more tightly than the comparison operators:

In [25]: c[(a>0)&(b<0)] = 2

And the last:

In [26]: c[b>0] = 3
In [27]: c
Out[27]: array([1, 3, 3, 3, 2, 3, 1, 3, 3, 3])

In this sample all of the initial 0s are overwritten, because every element satisfies at least one of the later conditions.

With many elements in the arrays, and just a few tests, I wouldn't worry about speed. Focus on clarity and expressiveness, not compactness.
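
The sequence above can be wrapped in a small function so the priority order is explicit. This is only a sketch: the name classify and the sample arrays are made up for illustration, and it assumes that positions matching none of the conditions (for example a == 0 with b <= 0) should keep the default 0, which the question does not specify.

```python
import numpy as np

def classify(a, b):
    """Label each position 0-3; later assignments override earlier ones."""
    c = np.zeros(a.shape, dtype=int)   # default label 0 (assumed for unmatched positions)
    c[a < 0] = 1
    c[(a > 0) & (b < 0)] = 2           # parentheses are required around each comparison
    c[b > 0] = 3                       # applied last, so it has the highest priority
    return c

# Hypothetical sample data, chosen to hit each label once.
a = np.array([2, -3,  4, -1, 0])
b = np.array([0,  0, -5,  6, 0])
print(classify(a, b).tolist())  # [0, 1, 2, 3, 0]
```

Because each assignment overwrites whatever came before it, the last line of the function decides ties between overlapping conditions.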

There is a three-argument version of np.where that can choose between values or arrays. But I rarely use it, and don't see many questions about it either.

In [28]: c = np.where(a>0, 0, 1)
In [29]: c
Out[29]: array([1, 0, 1, 1, 0, 1, 1, 0, 0, 1])
In [30]: c = np.where((a>0)&(b<0), 2, c)
In [31]: c
Out[31]: array([1, 0, 1, 1, 2, 1, 1, 0, 0, 1])
In [32]: c = np.where(b>0, 3, c)
In [33]: c
Out[33]: array([1, 3, 3, 3, 2, 3, 1, 3, 3, 3])

These wheres could be chained on one line.

c = np.where(b>0, 3, np.where((a>0)&(b<0), 2, np.where(a>0, 0, 1)))
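
To see that the nesting order matters, here is a small check on made-up data (the arrays and the variable names chained and reordered are illustrative only). The outermost where plays the role of the last sequential assignment, so moving a test outward raises its priority.

```python
import numpy as np

a = np.array([ 1, 1, -1, -1])
b = np.array([-1, 1, -1,  1])

# Same nesting as the one-liner above: b > 0 is outermost, so it wins ties.
chained = np.where(b > 0, 3,
                   np.where((a > 0) & (b < 0), 2,
                            np.where(a > 0, 0, 1)))

# Same tests, but with a > 0 moved outermost: it now masks labels 2 and 3.
reordered = np.where(a > 0, 0,
                     np.where(b > 0, 3,
                              np.where((a > 0) & (b < 0), 2, 1)))

print(chained.tolist())    # [2, 3, 1, 3]
print(reordered.tolist())  # [0, 0, 1, 3]
```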

2 Comments

The where chain is also sequential, correct? Or does the ordering not matter?
The innermost where is evaluated first and its result is passed as an argument to the next one out. Standard Python evaluation.

A generalised solution to this problem is possible via np.select. You need only supply a list of conditions and choices:

np.random.seed(0)

a = np.random.randint(-10, 10, (5, 5))
b = np.random.randint(-10, 10, (5, 5))

conditions = [b > 0, (a > 0) & (b < 0), a < 0, a > 0]
choices = [3, 2, 1, 0]

res = np.select(conditions, choices)

array([[3, 3, 1, 3, 1],
       [3, 3, 3, 3, 3],
       [1, 2, 1, 1, 1],
       [0, 2, 3, 3, 1],
       [1, 2, 2, 2, 1]])
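
Note that np.select takes the first condition in the list that is True for each element, which is why the conditions above are listed from highest to lowest priority, and it fills unmatched elements with default (0 unless specified). A small sketch on made-up data, using default=-1 purely to make the unmatched position visible:

```python
import numpy as np

# Hypothetical sample; the last position (a == 0, b == 0) matches no condition.
a = np.array([ 1, 1, -1, -1, 0])
b = np.array([-1, 1, -1,  1, 0])

conditions = [b > 0, (a > 0) & (b < 0), a < 0, a > 0]
choices = [3, 2, 1, 0]

# The first True condition wins; default=-1 is an illustrative choice, not part of the answer.
res = np.select(conditions, choices, default=-1)
print(res.tolist())  # [2, 3, 1, 3, -1]
```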

As @chrisz points out, you currently have overlapping conditions. This is how I'd go about using multiple if statements:

import numpy as np
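a = np.random.random(50)*20 - 10  # uniform on [-10, 10), so both signs occur
b = np.random.random(50)*20 - 10
# Boolean masks act as 0/1 in arithmetic; no list wrapper, so c stays an array
c = 0*(a>0)*(b<0) + 1*(a<0) + 3*(a==0)*(b>0)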

The comparisons return boolean arrays, which behave as 1 (True) and 0 (False) in arithmetic. By multiplying them and adding the terms together you can emulate multiple if statements. However, this ONLY WORKS if the conditions don't overlap.
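
To illustrate the overlap caveat, here is a sketch that plugs the question's four conditions into this arithmetic form on made-up data (the arrays and the name summed are illustrative). Where two conditions hold at once, their values are added together rather than one taking priority.

```python
import numpy as np

a = np.array([ 1, 1, -1, -1])
b = np.array([-1, 1, -1,  1])

# The question's conditions overlap, e.g. a < 0 and b > 0 can both be True.
summed = 0*(a > 0) + 1*(a < 0) + 2*((a > 0) & (b < 0)) + 3*(b > 0)
print(summed.tolist())  # [2, 3, 1, 4]
```

The last element satisfies both a < 0 and b > 0, so it gets 1 + 3 = 4, which is not one of the intended labels.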
