0

I am trying to replicate a plot in Figure 8 of this "https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2012JD017979" article using precipitation data.

Basically, it shows the distribution of (in my case) precipitation where the y-axis (oftentimes in distribution plots, the density) is the normalized volume, allowing different precipitation products to be comparable in one plot.

I have tried the code below using Scipy's stats.binned_statistic function and numpy:

compute mean of values in each bin, bin edges, and bin number

#NB: sim_id = simulated, obs_id = observed

im_bin_means, sim_bin_edges, sim_binnumber = stats.binned_statistic(x = sim_id,values =      sim_id,statistic='mean',bins=2000,range=(0.01,120))

sim_bin_cnt, sim_bin_edges, sim_binnumber = stats.binned_statistic(x = sim_id,values = sim_id,statistic='count',bins=2000,range=(0.01,120))

obs_bin_means, obs_bin_edges, obs_binnumber = stats.binned_statistic(x = obs_id,values = obs_id,statistic='mean',bins=2000,range=(0.01,120))

obs_bin_cnt, obs_bin_edges, obs_binnumber = stats.binned_statistic(x = obs_id,values = obs_id,statistic='count',bins=2000,range=(0.01,120))

compute a customized distribution as

sim_dist = (sim_bin_cnt*sim_bin_means)/np.nansum(sim_bin_cnt)
obs_dist = (obs_bin_cnt*obs_bin_means)/np.nansum(obs_bin_cnt)

the plot

f,ax = plt.subplots()

ax.plot(sim_bin_edges[:-1],sim_dist,c='r',label='SIM',lw=2,ls='-')
   
ax.plot(obs_bin_edges[:-1],obs_dist,c='k',label='OBS',lw=2,ls='--')

ax.set_xscale('log')

ax.legend(loc = 'best',frameon=False)  

yielding the plot below enter image description here

The problem is that the area under each curve should integrate to 1; i.e., np.nansum(sim_dist * np.diff(sim_bin_edges)) should be = 1. But this is not the case with my current method.

I would be glad if I could get help with an elegant method that leads to a more desirable answer. Thanks in advance

2
  • 1
    Thanks, Albe. I have updated the questions with more tags and a reformated code. Commented Oct 2, 2023 at 19:06
  • Have you tried matplotlib.pyplot.hist with density=True? You give it the raw data, and it produces a normalized histogram. Commented Nov 10, 2023 at 15:54

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.