2

I currently have raw data like this:

person1  person2   person3...
  blue     red      green
  red      blue     yellow
  black    black
  white    green
  orange

with lots of different values and columns.

What I need is:

         Blue  Red  Black  Green Yellow Orange White
Person1   Y     Y     Y                    Y     Y
Person2   Y     Y     Y      Y
Person3                      Y      Y

Any suggestions please?

Thanks

2
  • How is this data stored? In a file? data frame? Commented Nov 13, 2019 at 21:19
  • Hey, it's stored as a CSV file, so in Excel-style table format. Commented Nov 13, 2019 at 21:29

4 Answers

3

Method 1: DataFrame.apply + pd.value_counts

new_df=df.apply(pd.value_counts).replace({1:'Y',np.nan:''}).T
print(new_df)

        black blue green orange red white yellow
person1     Y    Y            Y   Y     Y       
person2     Y    Y     Y          Y             
person3                Y                       Y 

Method 2: pd.crosstab + DataFrame.melt

df2=df.melt()
new_df=pd.crosstab(df2['variable'],df2['value']).replace({0:'',1:'Y'}).rename_axis(index=None,columns=None)
print(new_df)
        black blue green orange red white yellow
person1     Y    Y            Y   Y     Y       
person2     Y    Y     Y          Y             
person3                Y                       Y
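
Both methods assume `df` already holds one column per person. A minimal sketch of getting there from a CSV and applying Method 2, assuming a comma-separated file with blank trailing cells (the inline `csv_text` is a hypothetical stand-in for the real file, and the names `long_df` and `counts` are only for illustration):

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the real CSV file; blank cells load as NaN.
csv_text = """person1,person2,person3
blue,red,green
red,blue,yellow
black,black,
white,green,
orange,,
"""

df = pd.read_csv(io.StringIO(csv_text))

# Long format: one row per (person, color) pair, blank cells dropped.
long_df = df.melt(var_name="person", value_name="color").dropna()

# Count each pair, then turn the counts into 'Y' / '' flags.
counts = pd.crosstab(long_df["person"], long_df["color"])
new_df = pd.DataFrame(
    np.where(counts > 0, "Y", ""),
    index=counts.index,
    columns=counts.columns,
).rename_axis(index=None, columns=None)

print(new_df)
```

Using `counts > 0` rather than matching exactly 1 means a color repeated within one person's column would still show as 'Y'.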

2 Comments

ansev, for your first method to work I needed to run new_df1 = new_df1.drop(new_df1.columns[0], axis=1). What version are you running? I upvoted you.
I use 0.25.2. You can also try: new_df=df.apply(lambda x: x.value_counts()).replace({1:'Y',np.nan:''}).T. Thanks for the upvote :)
1

I would use get_dummies (you can map True/False to 'Y'/'' at the end, e.g. with applymap({True:'Y',False:''}.get), or DataFrame.map in pandas 2.1+):

s=pd.get_dummies(df1)
s.columns=pd.MultiIndex.from_tuples(s.columns.str.split('_').map(tuple))
Yourdf=s.stack(0).groupby(level=1).sum().eq(1)
Yourdf
Out[132]: 
         black   blue  green  orange    red  white  yellow
person1   True   True  False    True   True   True   False
person2   True   True   True   False   True  False   False
person3  False  False   True   False  False  False    True

Or

pd.concat([df1[x].str.get_dummies() for x in df1.columns],keys=df1.columns,axis=1).\
        stack(1).groupby(level=1).sum().T.eq(1)
Out[164]: 
         black   blue  green  orange    red  white  yellow
person1   True   True  False    True   True   True   False
person2   True   True   True   False   True  False   False
person3  False  False   True   False  False  False    True
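
The same get_dummies idea can also be sketched on the melted (long) frame, which avoids the MultiIndex handling. This is only a sketch, assuming `df1` has one column per person with NaN for the blanks; the sample frame is rebuilt from the question, and `long_df`, `flags` and `pretty` are illustrative names:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; NaN marks the blank cells.
df1 = pd.DataFrame({
    "person1": ["blue", "red", "black", "white", "orange"],
    "person2": ["red", "blue", "black", "green", np.nan],
    "person3": ["green", "yellow", np.nan, np.nan, np.nan],
})

# One row per (person, color) pair, blanks removed.
long_df = df1.melt(var_name="person", value_name="color").dropna()

# One indicator column per color, then collapse the rows of each person.
dummies = pd.get_dummies(long_df["color"])
flags = dummies.groupby(long_df["person"]).max().astype(bool)

# Optional: show 'Y' / '' instead of True / False.
pretty = flags.apply(lambda col: col.map({True: "Y", False: ""}))
print(pretty)
```

Taking `max()` per person rather than `sum().eq(1)` keeps a color flagged even if it appears more than once in the same column.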


0

I have a primitive approach using a dictionary and the print function:

columns = ("Blue", "Red", "Black", "Green", "Yellow", "Orange", "White")

table_dict = {"Person1": ("Y", "Y", "Y", " ", " ", "Y", "Y"),
              "Person2": ("Y", "Y", "Y", "Y", " ", " ", " "),
              "Person3": (" ", " ", " ", "Y", "Y", " ", " ")}

print(" "*5, *columns, sep=" "*5)

for person in table_dict:
    print(person, end=" "*4)
    print(*table_dict.get(person), sep=" "*9)

Output:

          Blue     Red     Black     Green     Yellow     Orange     White
Person1    Y         Y         Y                             Y         Y
Person2    Y         Y         Y         Y                               
Person3                                  Y         Y                    
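
The flags in table_dict above are typed in by hand. A possible sketch of deriving them from the raw data in plain Python, assuming the colors have already been read into one list per person (the `raw` dictionary and the `width` value are illustrative, not part of the original answer):

```python
# Raw data from the question: one list of colors per person (assumed input shape).
raw = {
    "Person1": ["blue", "red", "black", "white", "orange"],
    "Person2": ["red", "blue", "black", "green"],
    "Person3": ["green", "yellow"],
}

columns = ("Blue", "Red", "Black", "Green", "Yellow", "Orange", "White")

# Build the same shape as table_dict: a tuple of "Y" / " " flags per person.
table_dict = {
    person: tuple("Y" if col.lower() in colors else " " for col in columns)
    for person, colors in raw.items()
}

# Fixed-width cells keep the columns aligned regardless of name length.
width = 8
print(" " * width + "".join(f"{col:<{width}}" for col in columns))
for person, flags in table_dict.items():
    print(f"{person:<{width}}" + "".join(f"{flag:<{width}}" for flag in flags))
```

Padding every cell to the same width sidesteps the separator arithmetic that the `sep=" "*9` version relies on.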


0

This is a working implementation, let me know what you think:

import numpy as np
import pandas as pd

d1={'person1': ['blue', 'red', 'black', 'white', 'orange'], 'person2': ['red', 'blue', 'black', 'green', ''], 'person3': ['green', 'yellow', '', '', '']}
df1 = pd.DataFrame(data=d1)
new_df1 = df1.apply(pd.value_counts).replace({1:'Y',np.nan:''})
# The '' placeholders are counted as a value and sort first, so drop that row before transposing.
new_df1 = new_df1.reset_index().drop(df1.index[0]).T

new_df1
             1     2      3       4    5      6       7
index    black  blue  green  orange  red  white  yellow
person1      Y     Y              Y    Y      Y        
person2      Y     Y      Y            Y               
person3                   Y                           Y
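
The extra row that has to be dropped comes from the empty strings, which value_counts treats as a value like any other. A sketch of one way around that, assuming blanks are stored as '' as in d1 above: convert them to NaN first so they are never counted (the names `cleaned` and `counts` are illustrative):

```python
import numpy as np
import pandas as pd

d1 = {
    "person1": ["blue", "red", "black", "white", "orange"],
    "person2": ["red", "blue", "black", "green", ""],
    "person3": ["green", "yellow", "", "", ""],
}
df1 = pd.DataFrame(data=d1)

# Treat empty strings as missing so they are never counted as a color.
cleaned = df1.replace("", np.nan)

# Per-person counts: rows are colors, columns are people, NaN where absent.
counts = cleaned.apply(lambda col: col.value_counts())

# Transpose so people become rows, then flag every color that was present.
present = counts.T.notna()
new_df1 = pd.DataFrame(
    np.where(present, "Y", ""),
    index=present.index,
    columns=present.columns,
)
print(new_df1)
```

With the blanks removed up front, no reset_index or drop step is needed and the color names stay as column labels.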

