
I'm trying to make a single boxplot chart area per month, with the boxplots grouped and labeled by industry, and have the Y-axis use a scale that I dictate.

In a perfect world this would be dynamic, and I could set the axis limits to a certain number of standard deviations from the overall mean. I could live with another way of setting the Y-axis dynamically, but I want it to be the same across all the monthly grouped boxplots that are created. I don't know the best way to handle this yet and am open to wisdom. All I know is that the numbers being used now are way too large for the charts to be meaningful.

I've tried all kinds of code and had zero luck scaling the axis; the code below is as close as I could get to the graph.

Here's a link to some dummy data: https://drive.google.com/open?id=0B4xdnV0LFZI1MmlFcTBweW82V0k

And for the code I'm using Python 3.5:

import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('TkAgg')
import pylab    
df =  pd.read_csv('Query_Final_2.csv')
df['Ship_Date'] = pd.to_datetime(df['Ship_Date'], errors = 'coerce')
df1 = (df.groupby('Industry'))
print(
df1.boxplot(column='Gross_Margin',layout=(1,9), figsize=(20,10), whis=[5,95])
,pylab.show()
)
    I like how you called the plotting function on a pandas.core.groupby.DataFrameGroupBy object, I did not know that was possible. Commented Nov 30, 2016 at 18:15

3 Answers

14

Here is a cleaned up version of your code with the solution:

import pandas as pd
import matplotlib.pyplot as plt

df =  pd.read_csv('Query_Final_2.csv')
df['Ship_Date'] = pd.to_datetime(df['Ship_Date'], errors = 'coerce')
df1 = df.groupby('Industry')

axes = df1.boxplot(column='Gross_Margin',layout=(1,9), figsize=(20,10),
                   whis=[5,95], return_type='axes')
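# Depending on the pandas version, `axes` is dict-like or a Series;
# if axes.values() raises a TypeError, loop with `for ax in axes:` instead.
for ax in axes.values():
    ax.set_ylim(-2.5, 2.5)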

plt.show()

The key is to return the subplots as axes objects and set the limits individually.
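For the dynamic part of the question (limits a fixed number of standard deviations from the overall mean), here is a minimal sketch of one way to do it. The synthetic data, the helper name `sigma_limits`, and the choice of `k=3.0` are assumptions for illustration, not part of the original code. The Agg backend is used only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; select it before importing pyplot
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for Query_Final_2.csv (values are made up).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Industry": np.repeat(["Retail", "Energy", "Tech"], 50),
    "Gross_Margin": rng.normal(loc=0.3, scale=0.2, size=150),
})

def sigma_limits(values, k=3.0):
    """Hypothetical helper: return (mean - k*std, mean + k*std)."""
    arr = np.asarray(values, dtype=float)
    mean, std = np.nanmean(arr), np.nanstd(arr)  # NaNs are ignored
    return float(mean - k * std), float(mean + k * std)

# Limits come from the whole column, so every subplot shares one scale.
ymin, ymax = sigma_limits(df["Gross_Margin"], k=3.0)

axes = df.groupby("Industry").boxplot(
    column="Gross_Margin", layout=(1, 3), figsize=(12, 4),
    whis=[5, 95], return_type="axes",
)

# The return type has varied across pandas versions (dict-like or Series),
# so normalise it to a plain list of Axes before looping.
axes_list = list(axes.values()) if isinstance(axes, dict) else list(axes)
for ax in axes_list:
    ax.set_ylim(ymin, ymax)
```

Because the limits are computed once from the full data set, the same `ymin` and `ymax` can be reused for every monthly figure to keep the charts comparable.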


5

Once you have computed ymin and ymax from the mean and the standard deviation, use:

plt.ylim(ymin, ymax)

to set the y-axis.
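As a rough sketch of what "establishing the variables" could look like, the snippet below derives `ymin` and `ymax` from the column's mean and standard deviation and applies them with `plt.ylim`. The small data frame and the name `n_sigma` are assumptions for illustration; the column names follow the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; select it before importing pyplot
import matplotlib.pyplot as plt
import pandas as pd

# Tiny made-up data set using the question's column names.
df = pd.DataFrame({
    "Industry": ["Retail", "Retail", "Energy", "Energy", "Tech", "Tech"],
    "Gross_Margin": [0.10, 0.30, 0.25, 0.45, 0.20, 0.50],
})

n_sigma = 2  # hypothetical choice: how many standard deviations to show
mean = df["Gross_Margin"].mean()
std = df["Gross_Margin"].std()  # pandas uses the sample std (ddof=1) by default
ymin, ymax = mean - n_sigma * std, mean + n_sigma * std

# One chart area, one box per industry.
df.boxplot(column="Gross_Margin", by="Industry")

# plt.ylim acts on the current axes, which here is the boxplot just drawn.
plt.ylim(ymin, ymax)
```

This works because the figure has a single axes. With several subplots, `plt.ylim` only changes whichever axes is current.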


-1

Thanks @Padraig. Note that if you are plotting on a single figure without subplots, you can use:

plt.ylim(ymin, ymax)

But if you want to adjust the Y-axis of a single subplot, this works (thanks @AlexG):

ax.set_ylim(ymin, ymax)

For instance, if your subplot is ax2 and you want the Y-axis to run from 0.5 to 1.0, the code looks like this:

ax2.set_ylim(0.5, 1.0)
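To make the difference concrete, here is a small self-contained sketch with two subplots. The data values are made up, and the Agg backend is used only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; select it before importing pyplot
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.boxplot([0.2, 0.6, 0.9, 1.4])
ax2.boxplot([0.55, 0.7, 0.8, 0.95])

# Object-oriented call: changes only ax2, regardless of which axes is current.
ax2.set_ylim(0.5, 1.0)

# pyplot call: changes only the *current* axes, so make ax1 current first.
plt.sca(ax1)
plt.ylim(0.0, 2.0)
```

Calling the method on the axes object is usually the less surprising option when a figure has more than one subplot, because it does not depend on which axes happens to be current.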
