How to map values in a list to a pandas dataframe with binary values

Question

I have a nested list of string values that I converted into a list of binary (0/1) indicator rows. I use the transformed list as predictors in my model.

The list with string values -

D = [["An", "Cn"], ["Bs", "Gt"], ["Cd", "El"], ["Cd", "Cn", "En"]]

With

import pandas as pd
D_tran = pd.Series([';'.join(i) for i in D]).str.get_dummies(';')

I obtained D_tran

   An  Bs  Cd  Cn  El  En  Gt
0   1   0   0   1   0   0   0
1   0   1   0   0   0   0   1
2   0   0   1   0   1   0   0
3   0   0   1   1   0   1   0

With

D_list = D_tran.values.tolist()

I obtained D_list:

[[1, 0, 0, 1, 0, 0, 0], [0, 1, 0, 0, 0, 0, 1], [0, 0, 1, 0, 1, 0, 0], [0, 0, 1, 1, 0, 1, 0]]

I use this to fit a linear regression model. To test the model, however, I need to convert the string values in my test data to the same binary form. The test data looks like this:

R = [["Bs"], ["Cd", "El"], ["An"]]

My question is how to map R onto the same columns as D_tran (and therefore D_list) in order to obtain

R = [[0, 1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 1, 0, 0], [1, 0, 0, 0, 0, 0, 0]]

Please note that only some of the predictors appear in the test data.

Thank you very much for your assistance.

root · Accepted Answer · 2016-07-27 22:19:40Z


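You can follow essentially the same procedure as before, with one modification: after creating the dummies dataframe, call reindex with the columns of D_tran. This aligns the test columns with the training columns, fills any column missing from the test data with 0, and drops any test column that does not exist in D_tran: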

R_tran = pd.Series([';'.join(i) for i in R]).str.get_dummies(';')
R_tran = R_tran.reindex(columns=D_tran.columns, fill_value=0)
R_list = R_tran.values.tolist()
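
For reference, the steps above can be put together as one self-contained script. This is a minimal sketch that uses the D and R from the question; the helper name to_dummies is hypothetical and introduced only for illustration.

```python
import pandas as pd

# Training and test data from the question.
D = [["An", "Cn"], ["Bs", "Gt"], ["Cd", "El"], ["Cd", "Cn", "En"]]
R = [["Bs"], ["Cd", "El"], ["An"]]


def to_dummies(rows, sep=";"):
    """Hypothetical helper: join each row with `sep`, then one-hot encode it."""
    return pd.Series([sep.join(row) for row in rows]).str.get_dummies(sep)


# Training frame: one column per label seen in D.
D_tran = to_dummies(D)

# Test frame: encode R, then align it to the training columns.
# Columns absent from R are added and filled with 0.
R_tran = to_dummies(R).reindex(columns=D_tran.columns, fill_value=0)
R_list = R_tran.values.tolist()

print(R_list)
# [[0, 1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 1, 0, 0], [1, 0, 0, 0, 0, 0, 0]]
```

Because reindex keeps only the columns it is given, a label that appears in the test data but not in D is silently dropped, so it may be worth checking for such labels before predicting.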

