I have been asked to plot some histograms and KDEs using seaborn. We only want to focus on part of the X axis, so I use ax.set_xlim(20_000, 42_460). In some cases most of the data lies below 20,000, so the plot looks like this:

Plot from 20,000 to 42,460

The full plot looks like this:

Full plot

There is data in that range, but since most of it falls between 0 and 20,000, matplotlib scales the Y axis to the tall bars there, and the data between 20,000 and 42,460 is too small to make out.

I would like a way to automatically adjust the Y limit so that the data between 20,000 and 42,460 is visible. I have been asked not to plot just that range: I have to plot the range 0 to 42,460 and then zoom in on 20,000 to 42,460.

I have found Axes.relim(), which accepts visible_only=True, but it does not seem to work as I expected.

Another option would be to compute the histogram with a different library, derive the Y limit from it and set it with ax.set_ylim(0, range_max), but we are also plotting a seaborn KDE that has the same problem, and that could be more complicated. Here is an image of a good plot:

Good plot

EDIT:

To reproduce the plot use this data and this code:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

ages = {
    '25-34':'blue',
    '35-44': 'orange',
    '45-54': 'green',
    '55-64': 'red',
}
markers = [plt.Line2D([0,0],[0,0],color=color, marker='o', linestyle='') for color in ages.values()]

data = pd.read_csv("./data.csv")

min = 20_000
max = 42_460

fig = plt.figure(figsize=(10,11))
fig.suptitle("Title", fontsize=12)
fig.legend(markers, ages.keys(),loc='center right')
gs = fig.add_gridspec(3, hspace=0, height_ratios=[5,1,5])
axs = gs.subplots(sharex=True, sharey=False)

sns.histplot(data=data, x="data", bins=200,ax=axs[0])
#axs[1].plot([pos[0] for pos in m1.elevation], [pos[1] for pos in m1.elevation])
sns.kdeplot(data=data, x="data", hue="labels",
    common_norm=False,bw_adjust=.25,ax=axs[2]
    ,legend=False, palette=ages.values(), hue_order=ages.keys())

plt.rcParams['xtick.bottom'] = plt.rcParams['xtick.labelbottom'] = True
plt.rcParams['xtick.top'] = plt.rcParams['xtick.labeltop'] = False

axs[0].set_axisbelow(True)
axs[0].set_xlim(min, max)
# pass the on/off flag positionally: the `b` keyword was removed in newer matplotlib
axs[0].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[0].grid(True, which='major', color='#cccccc20',lw=0.8)
axs[0].relim(visible_only=True)

axs[1].set_ylim(0, 40)
axs[1].set_xticks(np.arange(min, max, 2500))
axs[1].set_xticks(np.arange(min, max, 500), minor=True)
axs[1].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[1].grid(True, which='major', color='#cccccc20',lw=0.8)

axs[2].set_axisbelow(True)
axs[2].set_xlim(min, max)
axs[2].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[2].grid(True, which='major', color='#cccccc20',lw=0.8)

The middle plot has been omitted because it was not interesting and leaving it out makes the code simpler.

  • Sorry, I was just looking for a way of doing it, maybe the name of a function that I am not aware of. I will try to modify my code to make it less dependent on the rest of the project and edit the question to include it. Thanks! Commented May 30, 2021 at 19:41

1 Answer

The fastest way to achieve what you want is to subset your DataFrame:

# I've renamed these as they override the builtin min/max
min_ = 20_000
max_ = 42_460

# `mask` is an array of True/False that allows 
#   us to select a subset of the DataFrame
# both endpoints are included by default; newer pandas rejects inclusive=True
mask = data["data"].between(min_, max_)
plot_data = data[mask]
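To make the masking step concrete, here is a small self-contained sketch on made-up numbers (the toy DataFrame `df` is an assumption, not the question's data.csv). It relies on `Series.between` including both endpoints by default.

```python
import pandas as pd

# Toy stand-in for the question's data.csv (values are made up)
df = pd.DataFrame({"data": [5_000, 20_000, 30_000, 42_460, 50_000]})

# True for rows inside [20_000, 42_460], False otherwise
mask = df["data"].between(20_000, 42_460)

# Boolean indexing keeps only the rows where the mask is True
subset = df[mask]
print(subset)
```

Rows exactly at 20_000 and 42_460 are kept, while those outside the interval are dropped.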

If you set cut=0 in sns.kdeplot you shouldn't need to set the xlim for the axes, but this may truncate some of the curves. I've left it commented out because I think the plot looks better without it.

Also, as you use sharex on your subplots, I think you only need to call set_xlim once.
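A quick way to check the sharex behaviour is the sketch below. It uses plain `plt.subplots` rather than the gridspec from the question, which is an assumption made to keep the example short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Three stacked axes that share their X axis
fig, axs = plt.subplots(3, sharex=True)

# Set the limits on one axes only
axs[0].set_xlim(20_000, 42_460)

# The other axes report the same limits because the X axis is shared
limits = [ax.get_xlim() for ax in axs]
print(limits)
```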

Then use plot_data to plot your charts:

ages = {"25-34": "blue", "35-44": "orange", "45-54": "green", "55-64": "red"}
markers = [
    plt.Line2D([0, 0], [0, 0], color=color, marker="o", linestyle="")
    for color in ages.values()
]

fig = plt.figure(figsize=(10, 11))
fig.suptitle("Title", fontsize=12)
fig.legend(markers, ages.keys(), loc="center right")
gs = fig.add_gridspec(3, hspace=0, height_ratios=[5, 1, 5])
axs = gs.subplots(sharex=True, sharey=False)

sns.histplot(data=plot_data, x="data", bins=200, ax=axs[0])

sns.kdeplot(
    data=plot_data,
    x="data",
    hue="labels",
    common_norm=False,
    bw_adjust=0.25,
    ax=axs[2],
    legend=False,
    palette=ages.values(),
    hue_order=ages.keys(),
    # cut=0,
)

axs[0].set_axisbelow(True)
axs[0].set_xlim(min_, max_)
# pass the on/off flag positionally: the `b` keyword was removed in newer matplotlib
axs[0].grid(True, which="minor", color="#eeeeee90", lw=0.5)
axs[0].grid(True, which="major", color="#cccccc20", lw=0.8)

# axs[1] omitted

axs[2].set_axisbelow(True)
axs[2].set_xlim(min_, max_)
axs[2].grid(True, which="minor", color="#eeeeee90", lw=0.5)
axs[2].grid(True, which="major", color="#cccccc20", lw=0.8)

Which outputs:

Resulting plot
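Note that subsetting recomputes the 200 bins and the densities from the reduced data. If the bars and curves have to come from the full data, as the question requires, one alternative is to keep the original plots and derive the Y limit from the artists that fall inside the zoomed X range. The following is only a minimal sketch of that idea: `visible_ymax` is a hypothetical helper, the random data and `ax.hist` are stand-ins for data.csv and sns.histplot, and it assumes the histogram is drawn as bar patches and the KDE as unfilled lines.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Rectangle


def visible_ymax(ax, x_min, x_max):
    """Largest y value drawn on `ax` between x_min and x_max (hypothetical helper)."""
    tops = [0.0]
    # Histogram bars are Rectangle patches: keep those overlapping the X range
    for patch in ax.patches:
        if not isinstance(patch, Rectangle):
            continue
        left = patch.get_x()
        right = left + patch.get_width()
        if right > x_min and left < x_max:
            tops.append(patch.get_height())
    # Unfilled curves are Line2D objects: keep the points inside the X range
    for line in ax.lines:
        x = np.asarray(line.get_xdata(), dtype=float)
        y = np.asarray(line.get_ydata(), dtype=float)
        inside = (x >= x_min) & (x <= x_max)
        if inside.any():
            tops.append(float(y[inside].max()))
    return max(tops)


# Made-up data: most values below 20_000, a few in the range of interest
rng = np.random.default_rng(0)
values = np.concatenate(
    [rng.normal(8_000, 3_000, 5_000), rng.normal(30_000, 4_000, 200)]
)

fig, ax = plt.subplots()
ax.hist(values, bins=200)  # stand-in for sns.histplot on the full data
ax.set_xlim(20_000, 42_460)

# Scale the Y axis to the tallest visible bar, with 5 % headroom
ymax_visible = visible_ymax(ax, 20_000, 42_460)
ax.set_ylim(0, 1.05 * ymax_visible)
```

With the figure from the question, the same helper could be called on axs[0] and axs[2] after set_xlim. Only the sampled points of each curve are inspected, so the limit is approximate for the KDE.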
