Plot Data with MultiIndex from pd.DataFrame

Question

I import data into 3 different dataframes (all with the same keys) and combine them into 1 single dataframe.

df1 = read_xlsx('Means_Cent')
df2 = read_xlsx('Means_Rand')
df3 = read_xlsx('Means_Const')
df1['Key'] = 'Cent'
df2['Key'] = 'Rand'
df3['Key'] = 'Const'

df_means = pd.concat([df1,df2,df3], keys = ['Cent', 'Rand', 'Const'])

Now I want to create a plot using DataFrame.plot() with 1 graph for each key in ['Cent', 'Rand', 'Const'], all in the same figure.

Part of my dataframe df_means looks like this:

         02_VOI  03_Solidity  04_Total_Cells
Cent  0   1.430       19.470           132.0
      1   1.415       18.880           131.0
      2   1.460       19.695           135.0
      3   1.520       19.695           141.0
Rand  0   1.430       19.205           132.0
      1   1.430       19.170           132.0
      2   1.445       19.430           133.5
      3   1.560       19.820           144.5
Const 0   1.175       22.695           108.5
      1   1.430       22.260           132.0
      2   1.180       21.090           109.0
      3   1.360       22.145           126.0

Now I want to plot 02_VOI vs 04_Total_Cells, with 1 graph for each key (g1 = 02_VOI(Cent) vs 04_Total_Cells(Cent), g2 = 02_VOI(Rand) vs 04_Total_Cells(Rand), ...).

I tried it using DataFrame.unstack():

df_means.unstack(level = 0).plot(x = '02_VOI', y = '04_Total_Cells')

but this seems to mess up the keys. It returns 9 graphs (1 for each combination of VOI(Cent, Rand, Const) vs Total_Cells(Cent, Rand, Const)).

Thanks for your help. I would also welcome tips on a better way to combine the 3 initial dataframes.

Scott Boston · Accepted Answer · 2017-04-12 13:53:11Z

I think I would use Seaborn for this; it is much easier. Seaborn likes "tidy" data, where each row is one observation and the group label is an ordinary column.
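As a rough sketch of what "tidy" means here: the group label sits in a regular column rather than in an index level. The snippet below rebuilds the sample values from the question as three small frames (stand-ins for the Excel imports) and assumes the level names 'Groups' and 'Samples'. Passing names= to pd.concat is one way to get those columns straight from reset_index(), which may also make the extra 'Key' column from the question unnecessary.

```python
import pandas as pd

cols = ['02_VOI', '03_Solidity', '04_Total_Cells']

# Sample values copied from the question; these frames stand in for the
# three Excel imports.
df1 = pd.DataFrame([[1.430, 19.470, 132.0], [1.415, 18.880, 131.0],
                    [1.460, 19.695, 135.0], [1.520, 19.695, 141.0]], columns=cols)
df2 = pd.DataFrame([[1.430, 19.205, 132.0], [1.430, 19.170, 132.0],
                    [1.445, 19.430, 133.5], [1.560, 19.820, 144.5]], columns=cols)
df3 = pd.DataFrame([[1.175, 22.695, 108.5], [1.430, 22.260, 132.0],
                    [1.180, 21.090, 109.0], [1.360, 22.145, 126.0]], columns=cols)

# keys= labels each frame; names= names the two index levels
# ('Groups' and 'Samples' are names assumed for this sketch).
df_means = pd.concat([df1, df2, df3],
                     keys=['Cent', 'Rand', 'Const'],
                     names=['Groups', 'Samples'])

# reset_index() turns both index levels into ordinary columns: tidy data.
df_tidy = df_means.reset_index()
print(df_tidy.head())
```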

import pandas as pd
import seaborn as sns
df_mean = pd.read_clipboard()
df_mean

Output:

         02_VOI  03_Solidity  04_Total_Cells
Cent  0   1.430       19.470           132.0
      1   1.415       18.880           131.0
      2   1.460       19.695           135.0
      3   1.520       19.695           141.0
Rand  0   1.430       19.205           132.0
      1   1.430       19.170           132.0
      2   1.445       19.430           133.5
      3   1.560       19.820           144.5
Const 0   1.175       22.695           108.5
      1   1.430       22.260           132.0
      2   1.180       21.090           109.0
      3   1.360       22.145           126.0

Reset index and rename columns as you wish.

df_mean = df_mean.reset_index()
df_mean = df_mean.rename(columns={'level_0':'Groups','level_1':'Samples'})
_ = sns.lmplot(x='02_VOI',y='04_Total_Cells', data=df_mean, scatter=True, col='Groups',fit_reg=False)
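If you would rather stay with DataFrame.plot(), as the question asks, one possible sketch is to loop over the first index level with groupby and draw every group on a shared Axes. This is an alternative to the Seaborn call above, not a replacement for it. The data below is copied from the question (only the two plotted columns), and the 'Agg' backend is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend (assumption); remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Values copied from the question, limited to the two plotted columns.
voi = {'Cent': [1.430, 1.415, 1.460, 1.520],
       'Rand': [1.430, 1.430, 1.445, 1.560],
       'Const': [1.175, 1.430, 1.180, 1.360]}
cells = {'Cent': [132.0, 131.0, 135.0, 141.0],
         'Rand': [132.0, 132.0, 133.5, 144.5],
         'Const': [108.5, 132.0, 109.0, 126.0]}

keys = ['Cent', 'Rand', 'Const']
frames = [pd.DataFrame({'02_VOI': voi[k], '04_Total_Cells': cells[k]})
          for k in keys]
df_means = pd.concat(frames, keys=keys)

fig, ax = plt.subplots()

# One plot call per key, all on the same Axes. style='o' draws markers
# only, and each call picks the next color in the default cycle.
for key, group in df_means.groupby(level=0, sort=False):
    group.plot(x='02_VOI', y='04_Total_Cells', style='o', ax=ax, label=key)

ax.set_xlabel('02_VOI')
ax.set_ylabel('04_Total_Cells')
ax.legend()  # rebuild the legend from the labelled artists
```

Each key gets exactly one series here because the grouping happens before plotting, so no cross combinations between keys are drawn.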
