I read data from 3 different sources into dataframes (all with the same columns) and combine them into a single dataframe.

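import pandas as pd

# Read the three result tables with pandas' Excel reader
df1 = pd.read_excel('Means_Cent')
df2 = pd.read_excel('Means_Rand')
df3 = pd.read_excel('Means_Const')
df1['Key'] = 'Cent'
df2['Key'] = 'Rand'
df3['Key'] = 'Const'

# keys= adds an outer index level that records which table each row came from
df_means = pd.concat([df1, df2, df3], keys=['Cent', 'Rand', 'Const'])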

Now I want to create a plot using DataFrame.plot() with one graph for each key ('Cent', 'Rand', 'Const') in the same figure.

Part of my dataframe df_means looks like this:

         02_VOI  03_Solidity  04_Total_Cells
Cent  0   1.430       19.470           132.0
      1   1.415       18.880           131.0
      2   1.460       19.695           135.0
      3   1.520       19.695           141.0
Rand  0   1.430       19.205           132.0
      1   1.430       19.170           132.0
      2   1.445       19.430           133.5
      3   1.560       19.820           144.5
Const 0   1.175       22.695           108.5
      1   1.430       22.260           132.0
      2   1.180       21.090           109.0
      3   1.360       22.145           126.0

Specifically, I want to plot 02_VOI against 04_Total_Cells, with one graph per key (g1 = 02_VOI(Cent) vs 04_Total_Cells(Cent), g2 = 02_VOI(Rand) vs 04_Total_Cells(Rand), ...).

I tried it using DataFrame.unstack():

df_means.unstack(level = 0).plot(x = '02_VOI', y = '04_Total_Cells')

but this seems to mess up the keys: it returns 9 graphs, one for each combination of VOI(Cent, Rand, Const) vs Total_Cells(Cent, Rand, Const).

Thanks for your help. I would also welcome tips on a better way to combine the 3 initial dataframes.

1 Answer

I would use seaborn for this; it is much easier. Seaborn works best with "tidy" data: one row per observation, with the group label in an ordinary column rather than in the index.

import pandas as pd
import seaborn as sns
df_mean = pd.read_clipboard()
df_mean

Output:

         02_VOI  03_Solidity  04_Total_Cells
Cent  0   1.430       19.470           132.0
      1   1.415       18.880           131.0
      2   1.460       19.695           135.0
      3   1.520       19.695           141.0
Rand  0   1.430       19.205           132.0
      1   1.430       19.170           132.0
      2   1.445       19.430           133.5
      3   1.560       19.820           144.5
Const 0   1.175       22.695           108.5
      1   1.430       22.260           132.0
      2   1.180       21.090           109.0
      3   1.360       22.145           126.0

Reset the index (the two unnamed index levels become the columns level_0 and level_1) and rename those columns as you wish.

df_mean = df_mean.reset_index()
df_mean = df_mean.rename(columns={'level_0':'Groups','level_1':'Samples'})
_ = sns.lmplot(x='02_VOI',y='04_Total_Cells', data=df_mean, scatter=True, col='Groups',fit_reg=False)
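
If the graphs should share one set of axes and use only DataFrame.plot(), as the question asks, one possible approach is to group on the outer index level and draw each group onto the same Axes. The following is only a sketch: the three dataframes are rebuilt inline from the values shown in the question, the index level names 'Group' and 'Sample' are my own choice, and sorting each group by 02_VOI (so the line does not zigzag) is an assumption about what the plot should look like.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-ins for the three source tables (values copied from the question).
cols = ['02_VOI', '03_Solidity', '04_Total_Cells']
df1 = pd.DataFrame([[1.430, 19.470, 132.0], [1.415, 18.880, 131.0],
                    [1.460, 19.695, 135.0], [1.520, 19.695, 141.0]], columns=cols)
df2 = pd.DataFrame([[1.430, 19.205, 132.0], [1.430, 19.170, 132.0],
                    [1.445, 19.430, 133.5], [1.560, 19.820, 144.5]], columns=cols)
df3 = pd.DataFrame([[1.175, 22.695, 108.5], [1.430, 22.260, 132.0],
                    [1.180, 21.090, 109.0], [1.360, 22.145, 126.0]], columns=cols)

# names= labels the two index levels ('Group' and 'Sample' are hypothetical names),
# so a separate 'Key' column is not needed to identify the source table.
df_means = pd.concat([df1, df2, df3], keys=['Cent', 'Rand', 'Const'],
                     names=['Group', 'Sample'])

fig, ax = plt.subplots()
# One pass per group; passing ax= draws every group onto the same Axes.
for name, group in df_means.groupby(level='Group', sort=False):
    group.sort_values('02_VOI').plot(x='02_VOI', y='04_Total_Cells',
                                     ax=ax, marker='o', label=name)
ax.set_ylabel('04_Total_Cells')
```

Call plt.show() (or fig.savefig with a file name of your choice) afterwards to display or save the figure. With the index levels named this way, df_means.reset_index() also gives 'Group' and 'Sample' columns directly, without a rename step.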
