Adding column to numpy array based on where condition

Question

I have a numpy array and I would like to update a column of values in it with data from a second array, somewhat like the VLOOKUP function in Excel.

I need to look up the first column of a in the b table, then replace the third column in a with the number from the second column of b.

import numpy as np

# type, newval
a = np.array( [[1, 23, 0],
              [2, 24, 0],
              [1, 15, 0],
              [1, 27, 0],
              [6, 22, 0],
              [1, 18, 0]]
              )

# type, newval
b = np.array([[1, 1.1],
            [2, 2.1],
            [3, 3.1],
            [4, 4.1],
            [5, 5.1],
            [6, 6.1]]
            )

a[:,2] = np.where(b[:,0] == a[:,0], b[:,1], None)

Expected result (note: I would like the original array a to be updated with the lookup values):

a = array( [[1, 23, 1.1],
            [2, 24, 2.1],
            [1, 15, 1.1],
            [1, 27, 1.1],
            [6, 22, 6.1],
            [1, 18, 1.1]]
          )

What I get, however, is nan beside the last 4 items in the array. It looks like my np.where condition is replacing the value only where the position AND the number match, not just where the number matches.

Note, the b array can be a list or any other type of object if it makes things easier. The a array is read in from a file so I'd prefer not to change the structure of that.
– Carl, Commented Sep 28, 2014 at 9:02
Are you sure that your "Expected result" is correct? In the second row, second column I would expect a 2.1. Rows 5 and 6 seem to be interchanged.
– zinjaai, Commented Sep 28, 2014 at 10:00
Possible duplicate of SQL join or R's merge() function in NumPy?
– Georgy, Commented Apr 22, 2019 at 10:11

farenorth · Accepted Answer · 2014-09-30 15:52:17Z

3

You can transform the array b into a dictionary. Afterwards the desired result can be achieved with a list comprehension.

b_as_dict = dict(b)
res = [[k, b_as_dict[k]] for k in a[:,0]]

Regarding inserting these results into a:

Currently a is an integer array. To get these results into a you'll probably want to define it as float or float32 (because the values you're trying to insert are floats):

a = np.array([[1, 23, 0],
          [2, 24, 0],
          [1, 15, 0],
          [1, 27, 0],
          [6, 22, 0],
          [1, 18, 0]],
         dtype=np.float32)

Then you can use list comprehensions as Zinjaai suggested:

a[:, 2] = [b_as_dict[k] for k in a[:, 0]]
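Putting the pieces above together, here is a minimal self-contained sketch of the dictionary approach. Two things in it are my own assumptions rather than part of the answer: the name lookup, and the use of dict.get with a NaN default so that a key missing from b does not raise a KeyError. The keys and values are also converted to plain Python int and float to keep the dictionary independent of NumPy scalar types.

```python
import numpy as np

# a must be a float array, otherwise the looked-up values get truncated.
a = np.array([[1, 23, 0],
              [2, 24, 0],
              [1, 15, 0],
              [1, 27, 0],
              [6, 22, 0],
              [1, 18, 0]], dtype=float)

# Lookup table: key, new value
b = np.array([[1, 1.1],
              [2, 2.1],
              [3, 3.1],
              [4, 4.1],
              [5, 5.1],
              [6, 6.1]])

# Each row of b is a (key, value) pair; store them as plain Python types.
lookup = {int(k): float(v) for k, v in b}

# Assumption: keys absent from b become NaN instead of raising KeyError.
a[:, 2] = [lookup.get(int(k), np.nan) for k in a[:, 0]]

print(a)
```

Because only a[:, 2] is assigned, any other columns of a are left untouched and a is updated in place.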

edited Sep 30, 2014 at 15:52

farenorth


answered Sep 28, 2014 at 9:55

zinjaai



1 Comment

Carl Over a year ago

Thanks. Any way to get these results into the original array? Thing is, I have more columns in a than displayed in the example I gave above.

sebix · Accepted Answer · 2014-09-28 09:51:30Z

1

If b is sorted and its keys are consecutive integers starting at 1, the simplest solution is:

In [19]: b[a[:,0]-1]
Out[19]: 
array([[ 1. ,  1.1],
       [ 2. ,  2.1],
       [ 1. ,  1.1],
       [ 1. ,  1.1],
       [ 6. ,  6.1],
       [ 1. ,  1.1]])

Or, more slowly, step by step:

In [20]: a[:,0]
Out[20]: array([1, 2, 1, 1, 6, 1])

Subtracting 1 turns these into the row indices of our array b:

In [21]: a[:,0]-1
Out[21]: array([0, 1, 0, 0, 5, 0])

Now we just read these rows from b.
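To write the looked-up values back into a rather than just displaying the rows of b, something along these lines should work. It is a sketch that assumes a is stored as a float array (so the keys are cast back to int before indexing) and that b's keys are exactly 1, 2, ..., 6 in order; the name idx is mine.

```python
import numpy as np

a = np.array([[1, 23, 0],
              [2, 24, 0],
              [1, 15, 0],
              [1, 27, 0],
              [6, 22, 0],
              [1, 18, 0]], dtype=float)

b = np.array([[1, 1.1],
              [2, 2.1],
              [3, 3.1],
              [4, 4.1],
              [5, 5.1],
              [6, 6.1]])

# Row i of b holds key i + 1, so key - 1 is the row index into b.
# The cast is needed because a float array cannot be used as an index.
idx = a[:, 0].astype(int) - 1

# Take the second column of the selected rows and store it in a's third column.
a[:, 2] = b[idx, 1]

print(a)
```

This avoids the Python-level loop entirely, but it silently returns wrong values if b is ever not sorted and consecutive.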

answered Sep 28, 2014 at 9:51

sebix


1 Comment

farenorth Over a year ago

I like this solution too. If the lookup table b can be sorted in this way it is probably faster.
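If the table can be sorted by key but the keys are not consecutive, np.searchsorted is one way to keep the lookup vectorised. This is not from either answer; it is a rough sketch with made-up data (keys, vals and query are hypothetical names), and it marks keys that are missing from the table as NaN.

```python
import numpy as np

# Hypothetical lookup table, sorted by key but with gaps.
keys = np.array([1, 2, 4, 6, 9])
vals = np.array([1.1, 2.1, 4.1, 6.1, 9.1])

# Keys to look up; 3 does not exist in the table.
query = np.array([1, 2, 1, 6, 3])

# Position where each query key would be inserted to keep keys sorted.
pos = np.searchsorted(keys, query)

# Clip so that a query larger than every key still gives a valid index.
pos = np.minimum(pos, len(keys) - 1)

# A key was found only if the table entry at that position equals it.
found = keys[pos] == query

# Use the table value where found, NaN otherwise.
result = np.where(found, vals[pos], np.nan)

print(result)
```

The result can then be assigned to a column of a float array in the same way as in the answers above.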
