I am trying to get from the following dataframe:

          DAY    Col1    ColA    ColB    ColC
    ID    
    ABC   Mon    A        123
    DEF   Mon    A        456
    GHI   Mon    A        789
    ABC   Tue    A                123
    DEF   Tue    A                456
    GHI   Tue    A                789
    ABC   Wed    A                        123
    DEF   Wed    A                        456
    GHI   Wed    A                        789

into:

    ID    Mon    Tue    Wed
    ABC   123    123    123
    DEF   456    456    456
    GHI   789    789    789

So the idea is to remove the empty cells and reshape the table so that each unique day becomes a column holding each ID's value for that day.

Appreciate any help I get, thanks!

1 Answer

You can use:

df = (df.drop(columns='Col1')
        .set_index(['ID','DAY'])
        .stack()
        .reset_index(level=2, drop=True)
        .unstack())
print (df)
DAY    Mon    Tue    Wed
ID                      
ABC  123.0  123.0  123.0
DEF  456.0  456.0  456.0
GHI  789.0  789.0  789.0

Explanation:

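  1. Remove the unnecessary column Col1 with drop
  2. Move ID and DAY into the index with set_index
  3. Reshape with stack, which drops the NaNs and moves the column names into a new innermost level of the MultiIndex
  4. Remove that third level (level=2) of the MultiIndex with reset_index(level=2, drop=True)
  5. Reshape with unstack, which turns the DAY level into columns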
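
The steps above can be traced one at a time. This is a minimal sketch, assuming the sample frame below matches the question, that ID is an ordinary column, and that the empty cells are NaN. The names step1 to step4 and result are only illustrative. An explicit dropna() is added after stack() as a safeguard, since newer pandas versions may keep NaN rows when stacking.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumption: empty cells are NaN).
df = pd.DataFrame({
    'ID':   ['ABC', 'DEF', 'GHI'] * 3,
    'DAY':  ['Mon'] * 3 + ['Tue'] * 3 + ['Wed'] * 3,
    'Col1': ['A'] * 9,
    'ColA': [123, 456, 789] + [np.nan] * 6,
    'ColB': [np.nan] * 3 + [123, 456, 789] + [np.nan] * 3,
    'ColC': [np.nan] * 6 + [123, 456, 789],
})

step1 = df.drop(columns='Col1')                # 1. drop the unused column
step2 = step1.set_index(['ID', 'DAY'])         # 2. ID and DAY become the index
step3 = step2.stack().dropna()                 # 3. long Series indexed by (ID, DAY, column name)
step4 = step3.reset_index(level=2, drop=True)  # 4. discard the ColA/ColB/ColC level
result = step4.unstack()                       # 5. DAY values become the columns

print(result)
```

Printing step3 and step4 in between shows how the three value columns collapse into a single Series before being spread out again by day.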

EDIT: If ID is already the index, append DAY to it instead of setting both:

df = (df.drop(columns='Col1')
        .set_index('DAY', append=True)
        .stack()
        .reset_index(level=2, drop=True)
        .unstack()
        )
print (df)
DAY    Mon    Tue    Wed
ID                      
ABC  123.0  123.0  123.0
DEF  456.0  456.0  456.0
GHI  789.0  789.0  789.0
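
To try this variant on a frame where ID is the index, something like the following could be used. It is only a sketch: the sample data is reconstructed from the question, the empty cells are assumed to be NaN, and the name wide is illustrative. Checking df.index.nlevels first confirms how many index levels there are, which matters because level=2 in reset_index refers to the third level after DAY has been appended and stack has added its own level.

```python
import numpy as np
import pandas as pd

# Sample data with ID as the index (assumption: empty cells are NaN).
df = pd.DataFrame(
    {
        'DAY':  ['Mon'] * 3 + ['Tue'] * 3 + ['Wed'] * 3,
        'Col1': ['A'] * 9,
        'ColA': [123, 456, 789] + [np.nan] * 6,
        'ColB': [np.nan] * 3 + [123, 456, 789] + [np.nan] * 3,
        'ColC': [np.nan] * 6 + [123, 456, 789],
    },
    index=pd.Index(['ABC', 'DEF', 'GHI'] * 3, name='ID'),
)

print(df.index.nlevels)  # a single-level index here

wide = (df.drop(columns='Col1')
          .set_index('DAY', append=True)        # index is now (ID, DAY)
          .stack()
          .dropna()                             # safeguard if stack keeps NaN rows
          .reset_index(level=2, drop=True)      # drop the stacked column-name level
          .unstack())
print(wide)
```

If the starting index already has more than one level, the number passed to level would need to be adjusted so that it still points at the level created by stack.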

EDIT1: To put the columns in a custom order, add reindex:

df = (df.drop(columns='Col1')
        .set_index('DAY', append=True)
        .stack()
        .reset_index(level=2, drop=True)
        .unstack()
        .reindex(columns=['Wed','Tue','Mon'])
        )
print (df)
DAY    Wed    Tue    Mon
ID                      
ABC  123.0  123.0  123.0
DEF  456.0  456.0  456.0
GHI  789.0  789.0  789.0
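
One thing to keep in mind with reindex is that any label not present in the columns is added as an all-NaN column rather than raising an error. The sketch below illustrates this on a small hand-built frame. The names wide, week, ordered and present_only are hypothetical, and 'Thu' is included only to show the missing-label behaviour.

```python
import pandas as pd

# Hand-built stand-in for the reshaped result (illustrative values).
wide = pd.DataFrame(
    {'Mon': [123.0, 456.0, 789.0],
     'Tue': [123.0, 456.0, 789.0],
     'Wed': [123.0, 456.0, 789.0]},
    index=pd.Index(['ABC', 'DEF', 'GHI'], name='ID'),
)

week = ['Mon', 'Tue', 'Wed', 'Thu']  # desired order; 'Thu' is not in the data

# Labels missing from the frame come back as NaN columns.
ordered = wide.reindex(columns=week)

# Keep the desired order but skip labels that are absent.
present_only = wide.reindex(columns=[d for d in week if d in wide.columns])

print(ordered)
print(present_only)
```

This is useful because unstack sorts the new column labels, so day names such as Fri, Mon and Thu would otherwise come out in alphabetical rather than calendar order.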

18 Comments

Really interesting +1
@RaoSahab - Thank you.
Just realised the dataframe I have at the start is a MultiIndex - how can i remedy that please?
@Singapore123 - What is print (df.index.nlevels) ?
@jezrael Not too sure either. I made some edits to the original table in the question. Table I'm starting with has a multiindex, vs just single before