How to create Pandas groupby plot with subplots

Question

I have a data frame like this:

               value  identifier
2007-01-01  0.781611          55
2007-01-01  0.766152          56
2007-01-01  0.766152          57
2007-02-01  0.705615          55
2007-02-01  0.032134          56
2007-02-01  0.032134          57
2008-01-01  0.026512          55
2008-01-01  0.993124          56
2008-01-01  0.993124          57
2008-02-01  0.226420          55
2008-02-01  0.033860          56
2008-02-01  0.033860          57

So I do a groupby per identifier:

df.groupby('identifier')

And now I want to generate subplots in a grid, one plot per group. I tried

df.groupby('identifier').plot(subplots=True)

or

df.groupby('identifier').plot(subplots=False)

and

plt.subplots(3,3)
df.groupby('identifier').plot(subplots=True)

to no avail. How can I create the graphs?

Thanks, but I'm trying to avoid seaborn and use matplotlib only. Dependencies, Windows environment, etc.
– Ivan, Commented Apr 30, 2015 at 19:15
Old comment, but seaborn is an API for matplotlib. Seaborn reduces this to 1 line without any dataframe transformations: sns.relplot(kind='line', data=df.reset_index(), row='identifier', x='index', y='value').
– Trenton McKinney, Commented Aug 25, 2021 at 20:41

cphlewis · Accepted Answer · 2015-05-01 03:42:56Z

21

Here's an automated layout with lots of groups (of random fake data). Playing around with grouped.get_group(key) will show you how to do more elegant plots.

import pandas as pd
from numpy.random import randint
import matplotlib.pyplot as plt


df = pd.DataFrame(randint(0,10,(200,6)),columns=list('abcdef'))
grouped = df.groupby('a')
rowlength = (grouped.ngroups + 1) // 2                # integer column count; rounds up if odd number of groups
fig, axs = plt.subplots(figsize=(9,4),
                        nrows=2, ncols=rowlength,
                        gridspec_kw=dict(hspace=0.4)) # Much control of gridspec

targets = zip(grouped.groups.keys(), axs.flatten())
for key, ax in targets:
    ax.plot(grouped.get_group(key))                   # one line per column of the group
    ax.set_title('a=%d'%key)
ax.legend(df.columns)                                 # legend on the last populated axes only
plt.show()


edited May 1, 2015 at 3:42

answered Apr 30, 2015 at 19:29

cphlewis


2 Comments

Geoff Lentsch Over a year ago

You mentioned fix if odd, so: rowlength = grouped.ngroups//2 + (0 if grouped.ngroups % 2 == 0 else 1)

cs95 Over a year ago

It's helpful to understand that the reason this works is that you generate a bunch of axes, and pass each axis object in turn to each group being plotted. You're filling each subfigure with a sub-group plot. Neat!
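
The pattern described in this comment can be applied directly to the data from the question. The following is a minimal sketch, not the answer author's code: the two-column grid width (ncols) and the reconstructed sample frame are assumptions made for illustration. Each group is paired with one axes object and drawn there through the ax= argument of the pandas plot method.

```python
import math

import matplotlib.pyplot as plt
import pandas as pd

# Sample data shaped like the frame in the question (dates as the index).
dates = pd.to_datetime(["2007-01-01", "2007-02-01", "2008-01-01", "2008-02-01"])
df = pd.DataFrame(
    {
        "value": [0.781611, 0.766152, 0.766152,
                  0.705615, 0.032134, 0.032134,
                  0.026512, 0.993124, 0.993124,
                  0.226420, 0.033860, 0.033860],
        "identifier": [55, 56, 57] * 4,
    },
    index=dates.repeat(3),
)

grouped = df.groupby("identifier")

ncols = 2  # assumed grid width; pick whatever suits the figure
nrows = math.ceil(grouped.ngroups / ncols)
# squeeze=False keeps axs two-dimensional even for a single row or column.
fig, axs = plt.subplots(nrows=nrows, ncols=ncols, figsize=(9, 4), squeeze=False)

# Pair each (key, sub-frame) with one axes object and draw the group on it.
for (key, group), ax in zip(grouped, axs.flatten()):
    group["value"].plot(ax=ax, title=f"identifier={key}")

# Hide any grid cells left over when the group count does not fill the grid.
for ax in axs.flatten()[grouped.ngroups:]:
    ax.set_visible(False)

fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to display or save the figure. Iterating over the groupby object yields the key and the sub-frame together, which avoids a separate get_group lookup per key.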

Zero · Accepted Answer · 2020-07-06 23:17:30Z

20

You could use pd.pivot_table to get the identifiers in columns and then call plot()

pd.pivot_table(df.reset_index(),
               index='index', columns='identifier', values='value'
              ).plot(subplots=True)


The output of

pd.pivot_table(df.reset_index(),
               index='index', columns='identifier', values='value'
               )

looks like this:

identifier        55        56        57
index
2007-01-01  0.781611  0.766152  0.766152
2007-02-01  0.705615  0.032134  0.032134
2008-01-01  0.026512  0.993124  0.993124
2008-02-01  0.226420  0.033860  0.033860
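
Since the question asks for a grid, it may help to note that DataFrame.plot also accepts a layout argument together with subplots=True. The sketch below is an illustration under assumptions: the sample frame is rebuilt from the question, and the 2x2 layout and figure size are arbitrary choices, not part of the original answer.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data shaped like the frame in the question (dates as the index).
dates = pd.to_datetime(["2007-01-01", "2007-02-01", "2008-01-01", "2008-02-01"])
df = pd.DataFrame(
    {
        "value": [0.781611, 0.766152, 0.766152,
                  0.705615, 0.032134, 0.032134,
                  0.026512, 0.993124, 0.993124,
                  0.226420, 0.033860, 0.033860],
        "identifier": [55, 56, 57] * 4,
    },
    index=dates.repeat(3),
)

# One column per identifier, one row per date.
wide = pd.pivot_table(df.reset_index(),
                      index="index", columns="identifier", values="value")

# layout=(rows, cols) arranges the per-column subplots in a grid;
# the grid must have at least as many cells as there are columns.
axes = wide.plot(subplots=True, layout=(2, 2), figsize=(9, 4))
```

Here axes is a 2x2 array of matplotlib axes; the cell that has no column to show is hidden by pandas.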

edited Jul 6, 2020 at 23:17

answered Apr 30, 2015 at 19:37

Zero


Gabriel · Accepted Answer · 2019-03-15 05:57:13Z

4

If you have a Series with a MultiIndex, here's another solution for the desired graph.

df.unstack('identifier').plot.line(subplots=True)
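
The frame in the question is not a MultiIndex Series as posted, so here is a hedged sketch of one way to get there first. The set_index(..., append=True) step and the reconstructed sample data are assumptions added for illustration; the unstack and plot call is the one from this answer.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data shaped like the frame in the question (dates as the index).
dates = pd.to_datetime(["2007-01-01", "2007-02-01", "2008-01-01", "2008-02-01"])
df = pd.DataFrame(
    {
        "value": [0.781611, 0.766152, 0.766152,
                  0.705615, 0.032134, 0.032134,
                  0.026512, 0.993124, 0.993124,
                  0.226420, 0.033860, 0.033860],
        "identifier": [55, 56, 57] * 4,
    },
    index=dates.repeat(3),
)

# Move 'identifier' into the index, giving a Series indexed by (date, identifier).
series = df.set_index("identifier", append=True)["value"]

# unstack pivots the 'identifier' level into columns, one per group.
wide = series.unstack("identifier")
axes = wide.plot.line(subplots=True)
```

With subplots=True and no layout given, the subplots are stacked vertically, one per identifier.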

answered Mar 15, 2019 at 5:57

Gabriel


beyondfloatingpoint · Accepted Answer · 2019-07-04 11:09:34Z

Here is a solution for those who need to plot graphs for exploring different levels of aggregation when grouping by multiple columns.

from itertools import count

from numpy.random import randint
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

levels_bool = np.tile(np.arange(0,2), 100)
levels_groups = np.repeat(np.arange(0,4), 50)
x_axis = np.tile(np.arange(0,10), 20)
values = randint(0,10,200)

stacked = np.stack((levels_bool, levels_groups, x_axis, values), axis=0)
df = pd.DataFrame(stacked.T, columns=['bool', 'groups', 'x_axis', 'values'])

columns = len(df['bool'].unique())
rows = len(df['groups'].unique())
fig, axs = plt.subplots(rows, columns, figsize = (20,20))

y_index_counter = count(0)
groupped_df = df.groupby([ 'groups', 'bool','x_axis']).agg({
    'values': ['min', 'mean', 'median', 'max']
})
for group_name, grp in groupped_df.groupby('groups'):
    y_index = next(y_index_counter)
    x_index_counter = count(0)
    for boolean, grp2 in grp.groupby('bool'):
        x_index = next(x_index_counter)
        ax = axs[y_index, x_index]
        stats = grp2.reset_index()
        # stats['values'] holds the four aggregates, so this draws one line per statistic
        ax.plot(stats['x_axis'], stats['values'])
        ax.set_title("Group:{} Bool:{}".format(group_name, boolean))
        ax.legend(stats['values'].columns)            # min, mean, median, max

plt.subplots_adjust(hspace=0.5)
plt.show()
