Making Sense of Stack/Unstack/Pivot_Table/etc [duplicate]

Question

I have a Pandas DataFrame that looks like the following:

    ID           Date      Element  Temperature (C)
0   USW00094889 2014-11-12  TMAX    2.2
1   USC00208972 2009-04-29  TMIN    5.6
2   USC00200032 2008-05-26  TMAX    27.8
3   USC00205563 2005-11-11  TMAX    13.9
4   USC00200230 2014-02-27  TMAX    -10.6

I've been staring at the documentation for the stack and unstack functions and related functions like melt and pivot_table but I can't quite seem to figure out the combination of these that I need to do what I want. I'm trying to get rid of the "Element" column and instead create two new columns, "TMAX" and "TMIN". This would make each row a unique observation. Each Date/ID combo has a TMAX and TMIN. So theoretically this new dataframe would be half the height of the current one.

HI, welcome to StackOverflow. What have you tried so far? What was the result? Can you link to some other SO posts that address, but do not solve, this problem? — Evan
– Evan, Commented Dec 3, 2018 at 21:19
I would recommend including your desired output in your post as well. — rahlf23
– rahlf23, Commented Dec 3, 2018 at 21:20

Evan · Accepted Answer · 2018-12-03 21:31:57Z

Docs: https://pandas.pydata.org/pandas-docs/stable/reshaping.html

Create the DF and create the desired multi-index.

"""
ID,Date,Element,Temperature (C)
USW00094889,2014-11-12,TMAX,2.2
USC00208972,2009-04-29,TMIN,5.6
USC00200032,2008-05-26,TMAX,27.8
USC00205563,2005-11-11,TMAX,13.9
USC00200230,2014-02-27,TMAX,-10.6
"""
import pandas as pd
df = pd.read_clipboard(sep=",").set_index(["ID", "Date"])
print(df)

Output:

                       Element  Temperature (C)
ID          Date                               
USW00094889 2014-11-12    TMAX              2.2
USC00208972 2009-04-29    TMIN              5.6
USC00200032 2008-05-26    TMAX             27.8
USC00205563 2005-11-11    TMAX             13.9
USC00200230 2014-02-27    TMAX            -10.6

df.pivot(index=df.index, columns="Element")

Output:

                       Temperature (C)     
Element                           TMAX TMIN
ID          Date                           
USC00200032 2008-05-26            27.8  NaN
USC00200230 2014-02-27           -10.6  NaN
USC00205563 2005-11-11            13.9  NaN
USC00208972 2009-04-29             NaN  5.6
USW00094889 2014-11-12             2.2  NaN

Note that I dropped the numerical ID from your sample df.

Collectives™ on Stack Overflow

Making Sense of Stack/Unstack/Pivot_Table/etc [duplicate]

1 Answer 1

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Linked

Related