For every row (index) of a DataFrame, I need the label of the column in which a specific element appears, collected into a list. For example, given this DataFrame:

>>> df
                     1           2           3           4           5
2016-01-27           A           B           B           I           I  
2016-03-07           A           C           D           U           U   
2016-04-12           H           A           V           V           V   
2016-05-02           B           L           Y           S           N   
2016-05-23           L           N           N           A           S  

Given the input "A", I would like this list as output:

[1,1,2,NaN,4]

Is there a built-in method for this?

Edit: In the original table all items in a row are unique. I introduced the duplicates by mistake while trimming the table to make it less "dense" for posting here, sorry.

  • Do you want the first index of the input? What would 'B' return for row 1? Commented Nov 1, 2016 at 19:18
  • In the original table all items in a row are unique. Sorry, I introduced this mistake when I edited the table to make it less "dense" for posting here. Commented Nov 1, 2016 at 19:23

1 Answer

You may want to melt your data frame to long format and then collect, for each value, the list of columns in which it appears. Once you have the resulting Series, shown below, it is easy to look up the result for any input:

import pandas as pd
pd.melt(df).groupby('value').variable.apply(list)

#value
#A    [1, 1, 2, 4]
#B       [1, 2, 3]
#C             [2]
#D             [3]
#H             [1]
#I          [4, 5]
#L          [1, 2]
#N       [2, 3, 5]
#S          [4, 5]
#U          [4, 5]
#V       [3, 4, 5]
#Y             [3]
#Name: variable, dtype: object

To get the list of columns for input A:

result = pd.melt(df).groupby('value').variable.apply(list)

result['A']
# ['1', '1', '2', '4']
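
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the question; the string column labels (suggested by the quoted output above) and the plain string index are assumptions.

```python
import pandas as pd

# Rebuilt by hand from the question. String column labels and a plain
# string index are assumptions, not confirmed by the original post.
df = pd.DataFrame(
    [list("ABBII"), list("ACDUU"), list("HAVVV"), list("BLYSN"), list("LNNAS")],
    index=["2016-01-27", "2016-03-07", "2016-04-12", "2016-05-02", "2016-05-23"],
    columns=list("12345"),
)

# Long format: one row per (column label, cell value) pair.
long_df = pd.melt(df)

# For each distinct value, collect the labels of the columns it occurs in.
result = long_df.groupby("value")["variable"].apply(list)

print(result["A"])
```

Note that melt stacks the frame column by column, so each list is ordered by column and then by row. It has no entry for rows that lack the value, which means it is not aligned with the index.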

4 Comments

This works well, but is there a way to get a "NaN" value when there is no 'A' in the row?
Is it guaranteed that each row has at most one A? What if it has multiple As? Which one do you want to keep?
Yes, all items in a row are unique. I made this mistake when editing the table to make it look less "dense" here, and I have just edited the original post.
Then you could try something like the following (assuming numpy is imported as np): df.apply(lambda r: (r == 'A').idxmax() if (r == 'A').any() else np.nan, axis=1).tolist()
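
The row-wise suggestion in the last comment can be sketched as a runnable example. This assumes, as the asker confirmed, that each value occurs at most once per row. The DataFrame is rebuilt by hand from the question (with the duplicates it contains as posted), the string column labels are an assumption, and `column_of` is a hypothetical helper name used only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuilt by hand from the question. String column labels and a plain
# string index are assumptions.
df = pd.DataFrame(
    [list("ABBII"), list("ACDUU"), list("HAVVV"), list("BLYSN"), list("LNNAS")],
    index=["2016-01-27", "2016-03-07", "2016-04-12", "2016-05-02", "2016-05-23"],
    columns=list("12345"),
)


def column_of(row, target):
    """Hypothetical helper: label of the first column equal to target, else NaN."""
    hits = row == target
    # idxmax on a boolean Series returns the label of the first True entry.
    return hits.idxmax() if hits.any() else np.nan


# One entry per row, in index order, with NaN where the value is absent.
columns_of_a = df.apply(column_of, axis=1, target="A").tolist()
print(columns_of_a)
```

Unlike the melt approach, the result here has exactly one entry per row, so it stays aligned with the index. If a row did contain the value more than once, only the first matching column would be reported.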
