
I'm trying to transform a column containing lists of values into a set of new indicator columns, one for each distinct value that appears in any row of that column.

For example, given:

index    cat
0       ['a','b']
1       ['c','a','d']
2       ['e','b','c']

I'd like to get:

index    a       b       c        d         e
0        1       1       0        0         0
1        1       0       1        1         0
2        0       1       1        0         1

Could you help me and point me in the right direction? Thanks.

1 Answer


Use:

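import pandas as pd

# df = df.set_index('index')  # uncomment if 'index' is a regular column
d = df.explode('cat')  # one row per list element; the index label is repeated
new_df = pd.crosstab(d.index, d['cat'])  # count each value per original row
print(new_df)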

Output

cat    a  b  c  d  e
row_0               
0      1  1  0  0  0
1      1  0  1  1  0
2      0  1  1  0  1

print(df)
         cat
0     [a, b]
1  [c, a, d]
2  [e, b, c]
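
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same explode + crosstab approach. The sample frame is rebuilt from the question; the names `exploded` and `indicators` are only illustrative and do not come from the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question (one list of categories per row).
df = pd.DataFrame({"cat": [["a", "b"], ["c", "a", "d"], ["e", "b", "c"]]})

# explode() emits one row per list element and repeats the original index label.
exploded = df.explode("cat")

# crosstab() counts how often each value occurs for each original index label.
indicators = pd.crosstab(exploded.index, exploded["cat"])

print(indicators)
```

Note that crosstab returns counts, so a list containing the same value twice would give 2 rather than 1. If strict 0/1 indicators are needed, something like `indicators.clip(upper=1)` should cap them.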

4 Comments

Thanks, but where is that row_0 coming from?
I think it's the default name crosstab gives the row axis when the index passed in has no name. You can use pd.crosstab(d.index, d.cat, rownames=['myrowname']) to change the name, or new_df.rename_axis(index=None) to remove it! pandas.pydata.org/pandas-docs/stable/reference/api/…
Yea, should have read the doc before typing :d Well thanks a lot. I saw that the question got downvoted and see you have quite some rep, any clue on how to improve the question?
You're welcome :) I really don't understand why this question was downvoted; I think it's quite clear and simple to understand. I've already upvoted it :)
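
The two ways of handling the row_0 label mentioned in the comments can be sketched as follows. This assumes the sample data from the question; the names `named` and `unnamed` and the row name "index" are chosen only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"cat": [["a", "b"], ["c", "a", "d"], ["e", "b", "c"]]})
d = df.explode("cat")

# Option 1: give the row axis an explicit name instead of the default "row_0".
named = pd.crosstab(d.index, d["cat"], rownames=["index"])

# Option 2: build the table as before, then drop the row axis name.
unnamed = pd.crosstab(d.index, d["cat"]).rename_axis(index=None)

print(named.index.name)
print(unnamed.index.name)
```

Either option leaves the counts themselves unchanged; only the axis label differs.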
