I have a pandas DataFrame like this:

import numpy as np
import pandas as pd

data = [[0, 10, 22000, 3],
        [1, 15, 42135, 4], 
        [0, 14, 13526, 5],
        [0, 16, 32156, 3], 
        [1, 23, 13889, 5], 
        [0, 18, 18000, 6], 
        [0, 21, 13189, 2], 
        [1, 32, 58766, 2]] 

df = pd.DataFrame(data, columns = ['Gender', 'Age', 'Amount','Dependents']) 

And I have a NumPy array:

arr = np.array([[1, 15, 42135, 4],
                [1, 23, 13889, 5],
                [0, 21, 13189, 2]])

I would like to add a new column to the DataFrame df (say 'Good_Bad') that is 1 if the row is present in the array and 0 otherwise.

The result should be like

data = [[0, 10, 22000, 3, 0], 
        [1, 15, 42135, 4, 1], 
        [0, 14, 13526, 5, 0],
        [0, 16, 32156, 3, 0], 
        [1, 23, 13889, 5, 1], 
        [0, 18, 18000, 6, 0], 
        [0, 21, 13189, 2, 1], 
        [1, 32, 58766, 2, 0]] 

Records 2, 5 and 7 have 1 in the new column and the other records have 0. I am not sure how to match the rows of the array against the rows of the DataFrame.

1 Answer

Approach #1

A vectorized approach with broadcasting:

dfc = df[['Gender','Age','Amount','Dependents']] # select relevant cols
df['Good_Bad'] = (dfc.values[:,None]==arr).all(2).any(1).astype(int)

On newer pandas versions (>= v0.24), use dfc.to_numpy(copy=False) instead of dfc.values.
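To make the one-liner above easier to follow, here is a minimal sketch that splits it into its intermediate steps, using the data from the question. The names rows, eq, row_matches and flags are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

data = [[0, 10, 22000, 3],
        [1, 15, 42135, 4],
        [0, 14, 13526, 5],
        [0, 16, 32156, 3],
        [1, 23, 13889, 5],
        [0, 18, 18000, 6],
        [0, 21, 13189, 2],
        [1, 32, 58766, 2]]
df = pd.DataFrame(data, columns=['Gender', 'Age', 'Amount', 'Dependents'])
arr = np.array([[1, 15, 42135, 4],
                [1, 23, 13889, 5],
                [0, 21, 13189, 2]])

# Illustrative names below (rows, eq, row_matches, flags) are assumptions
rows = df[['Gender', 'Age', 'Amount', 'Dependents']].to_numpy()  # shape (8, 4)

# rows[:, None] has shape (8, 1, 4); comparing it with arr of shape (3, 4)
# broadcasts to (8, 3, 4): every df row against every arr row, column by column
eq = rows[:, None] == arr

row_matches = eq.all(axis=2)     # (8, 3): df row i equals arr row j in all columns
flags = row_matches.any(axis=1)  # (8,): df row i equals at least one arr row

df['Good_Bad'] = flags.astype(int)
print(df['Good_Bad'].tolist())   # [0, 1, 0, 0, 1, 0, 1, 0]
```

Note that the intermediate boolean array holds len(df) * len(arr) * n_columns elements, so memory use grows with the product of both row counts.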

Approach #2

Here's one that uses views, which saves memory and hence improves performance:

# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
    # This function gets 1D view into 2D input arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[-1]))
    return a.view(void_dt).ravel(),  b.view(void_dt).ravel()

# dfc and arr must share the same dtype so that each row maps to the same number of bytes
D, A = view1D(dfc, arr)
df['Good_Bad'] = np.isin(D, A).astype(int)
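For reference, a self-contained sketch of the view-based approach on the data from the question could look like the following. The explicit astype call and the name arr_same are assumptions added here, not part of the original answer: they guard against the case where the DataFrame values and arr have different integer dtypes, since the byte-level view only works when both have the same item size.

```python
import numpy as np
import pandas as pd

def view1D(a, b):
    # View each row of the 2D inputs as one opaque void element
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[-1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

data = [[0, 10, 22000, 3],
        [1, 15, 42135, 4],
        [0, 14, 13526, 5],
        [0, 16, 32156, 3],
        [1, 23, 13889, 5],
        [0, 18, 18000, 6],
        [0, 21, 13189, 2],
        [1, 32, 58766, 2]]
df = pd.DataFrame(data, columns=['Gender', 'Age', 'Amount', 'Dependents'])
arr = np.array([[1, 15, 42135, 4],
                [1, 23, 13889, 5],
                [0, 21, 13189, 2]])

dfc = df[['Gender', 'Age', 'Amount', 'Dependents']].to_numpy()

# Assumption: align dtypes first so both row views have the same byte length
arr_same = arr.astype(dfc.dtype, copy=False)

D, A = view1D(dfc, arr_same)  # D has one element per df row, A one per arr row
df['Good_Bad'] = np.isin(D, A).astype(int)
```

Because each row is compared as raw bytes, this sketch assumes a single numeric dtype across the selected columns; object or mixed-dtype data would need to be converted first.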

2 Comments

If the data has bool columns, all() is not getting applied.
@hanzgs If you have extra columns in df that should not be included in the mapping, then select only the columns required for the mapping. So, replace df with df[['Gender', 'Age', 'Amount', 'Dependents']] everywhere in the code. Does that answer your question?
