I have a pandas DataFrame in which four categories (50, 60, 70, 80) are nested within two categories (positive, negative), and I would like to draw a seaborn kdeplot of one column (e.g., A_mean...) for each group of a groupby. What I want to achieve is this picture (I made it by splitting the DataFrame into a list). I went over several posts. This code (from "Multiple single plots in seaborn with pandas groupby data") works for one level of grouping, but not for two, i.e. when I also want a separate curve for each Game_RS:

for i, group in df_hb_SLR.groupby('Condition'):
    sns.kdeplot(data=group['A_mean_per_subject'], shade=True, color='blue', label = 'label name')

I tried to use this one (Seaborn groupby pandas Series) but the first answer did not work for me:

sns.kdeplot(df_hb_SLR.A_mean_per_subject, groupby=df_hb_SLR.Game_RS)

AttributeError: 'Line2D' object has no property 'groupby'

and I was not able to make the pivot answer work. Is there a direct way to do this in seaborn, or a better way to do it straight from the pandas DataFrame?

My data are available in CSV format under this link (data), and I load them as usual:

df_hb_SLR = pd.read_csv('data.csv')

Thank you for your help.

1 Answer

Here is a solution using seaborn's FacetGrid, which makes this kind of thing really easy:

g = sns.FacetGrid(data=df_hb_SLR, col="Condition", hue='Game_RS', height=5, aspect=0.5)
g = g.map(sns.kdeplot, 'A_mean_per_subject', shade=True)
g.add_legend()

The downside of FacetGrid is that it creates a new figure, so if you'd like to integrate those plots into a larger ensemble of subplots, you can achieve the same result using groupby() and some looping:

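import matplotlib.pyplot as plt
import seaborn as sns

group1 = 'Condition'   # one subplot per value of this column
N1 = df_hb_SLR[group1].nunique()
group2 = 'Game_RS'     # one curve per value of this column in each subplot
target = 'A_mean_per_subject'
height = 5
aspect = 0.5
colour = ['gray', 'blue', 'green', 'darkorange']

# figure width grows with the number of subplots; the height stays fixed
fig, axs = plt.subplots(1, N1, figsize=(N1*height*aspect, height), sharey=True)

for (group1Name, df1), ax in zip(df_hb_SLR.groupby(group1), axs):
    ax.set_title(group1Name)
    for (group2Name, df2), c in zip(df1.groupby(group2), colour):
        # shade= is deprecated since seaborn 0.11; newer versions use fill=True
        sns.kdeplot(df2[target], shade=True, label=group2Name, ax=ax, color=c)
    ax.legend()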
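
To see what the two nested loops iterate over, here is a minimal sketch on a toy DataFrame. The column names match the question, but the four rows are made up for illustration. Iterating over a groupby object yields (key, sub-DataFrame) pairs, with the keys sorted by default, so the outer loop hands out one sub-frame per Condition and the inner loop splits that sub-frame again by Game_RS.

```python
import pandas as pd

# Toy stand-in for df_hb_SLR; the values are invented for illustration only.
df = pd.DataFrame({
    "Condition": ["positive", "positive", "negative", "negative"],
    "Game_RS": [50, 60, 50, 60],
    "A_mean_per_subject": [0.1, 0.2, 0.3, 0.4],
})

pairs = []
# Outer loop: one (key, sub-DataFrame) pair per distinct Condition.
for cond, df1 in df.groupby("Condition"):
    # Inner loop: split that sub-DataFrame again, this time by Game_RS.
    for game, df2 in df1.groupby("Game_RS"):
        pairs.append((cond, int(game), len(df2)))

print(pairs)
```

Each df2 here holds only the rows for one Condition and one Game_RS, which is what gets passed to sns.kdeplot in the loop above.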

4 Comments

Hi Diziet, thank you very much for your answer. I have two small questions about the second (for-loop) solution. 1) Could you please describe how looping over groupby objects works? 2) I wanted to add a custom colour for each game, so I tried to generalize your solution to:
```
colour = ['gray', 'blue', 'green', 'darkorange']
...
for (group1Name,df1),ax in zip(df_hb_SLR.groupby(group1),axs):
    ax.set_title(group1Name)
    for (group2Name,df2),colour in zip(df1.groupby(group2), colour):
        sns.kdeplot(df2[target], shade=True, label=group2Name, ax=ax, color = colour)
```
Your code was almost correct, except that you were using the same name colour twice in the for-loop. I've amended my answer.
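
The effect of reusing the name can be reproduced without any plotting. The sketch below uses plain lists in place of the groupby objects (the `games` and condition lists are stand-ins, not the real data): after the first pass of the inner loop, `colour` is no longer the list but its last element, so the second pass zips over the characters of the string 'darkorange'.

```python
colour = ["gray", "blue", "green", "darkorange"]
games = [50, 60, 70, 80]  # stand-in for the Game_RS groups

buggy = []
for condition in ["negative", "positive"]:
    # The loop variable rebinds `colour`, overwriting the list.
    for game, colour in zip(games, colour):
        buggy.append((condition, game, colour))

# Fixed: use a different name for the loop variable.
colours = ["gray", "blue", "green", "darkorange"]
fixed = []
for condition in ["negative", "positive"]:
    for game, c in zip(games, colours):
        fixed.append((condition, game, c))

print(buggy[4:])  # second condition gets single letters, not colour names
print(fixed[4:])
```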
One more question. If I swap the groups, i.e. group1 = 'Game_RS'; group2 = 'Condition', and then use 2 rows and 2 columns, fig, axs = plt.subplots(2,2, figsize=(N1*height*aspect,N1*height*aspect), sharey=True), I get this error: AttributeError: 'numpy.ndarray' object has no attribute 'set_title'. Do you know why, and how to fix it? If I use (1,N1), it works. Thanks
There are many posts about that error; for instance, see here. Basically, if you ask for more than one row and more than one column, the returned object is a 2D numpy array of axes, so iterating over it yields whole rows rather than single axes. You can loop over the individual axes using for (...),ax in zip(..., axs.flat):
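
A minimal sketch of that fix, assuming a 2x2 grid and using plain strings in place of the groupby keys (the titles are placeholders, and the Agg backend is selected only so the snippet runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 2, sharey=True)
print(axs.shape)  # axs is a 2D numpy array of Axes objects

titles = ["50", "60", "70", "80"]  # placeholders for the group names
# axs.flat walks the grid row by row and yields one Axes at a time.
for name, ax in zip(titles, axs.flat):
    ax.set_title(name)

result = [ax.get_title() for ax in axs.flat]
plt.close(fig)
```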
