0

I am reading data from a large file and want to plot the PDF (probability density function) of that data in Python.

This is the part of my code related to the question (I know it isn't very helpful on its own, but I can't upload the files because they're huge).

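import numpy as np  # imports assumed from the aliases used below
import matplotlib.pyplot as pl
from scipy.stats import norm
ax_1 = pl.subplot(2, 2, 3)
# Gaussian density with the given mean and std, evaluated at the bin positions
y = norm.pdf(bins, Nat_Coronary_Mean, Nat_Coronary_std)
l = pl.plot(bins, y, 'r-', linewidth=2.5)
# Build ticks and labels from the same range so their counts always match
sigma_multiples = np.arange(-13, 13)
x_ticks_1 = sigma_multiples * Nat_Coronary_std
x_labels_1 = [r"${} \sigma$".format(i) for i in sigma_multiples]
ax_1.set_xticks(x_ticks_1)
ax_1.set_xticklabels(x_labels_1)
pl.title('Nat Cor Tau PDF: Mean ' + str(Nat_Coronary_Mean) + ' and sigma ' + str(Nat_Coronary_std), fontsize=11)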

Does norm.pdf transform the distribution into a Gaussian one, or does it normalize the data? (I understand that the area under a PDF curve is equal to 1.) I am confused about why there is no standalone "pdf" function and what the "norm" is for.

6
  • It is using the Gaussian distribution, usually called the "normal" distribution. A normal distribution is defined by two parameters, the mean and the std of the distribution. You always need to say which distribution family you are using when you talk about working with a PDF. Commented Nov 10, 2015 at 23:26
  • Okay thank you. How can I just plot a PDF without the norm? I couldn't find anything in the literature about just a PDF. Any ideas? Commented Nov 10, 2015 at 23:34
  • Err... a probability distribution simply gives the probability at each value of your variable. If your data are not normal (for a normal distribution, about 68% of the outcomes fall within one std of the mean), do you already have the probabilities calculated? You could measure and then plot the relative frequencies (i.e., normalize the counts, which is a different sense of "normal"). A PDF uses a predefined function to model the probability. If you want a PDF, you have to choose a function, and if you don't want the Gaussian, then which one is it: Student's t, chi-square, exponential, etc.? Commented Nov 10, 2015 at 23:41
  • Here is something for you to watch. It's only for a discrete probability distribution, but it should give you the gist. Commented Nov 10, 2015 at 23:44
  • Yes, I understand what a PDF is, but a PDF in general isn't only defined for the Gaussian distribution. It can take many different shapes, and I want to see what my distribution looks like without transforming it into a Gaussian one. I have a set of data that I got from experiments and I am trying to analyze them. So I want to check the overall histograms and pdf's before I make conclusions. Commented Nov 11, 2015 at 0:14
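
To make the point from the comments concrete, here is a minimal sketch showing that norm.pdf does not transform or normalize any data: it only evaluates the Gaussian density formula at the points it is given. The values of mean and std are hypothetical stand-ins for Nat_Coronary_Mean and Nat_Coronary_std.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical parameters standing in for the sample mean and std.
mean, std = 2.0, 0.5
x = np.linspace(mean - 4 * std, mean + 4 * std, 9)

# norm.pdf evaluates the Gaussian density at x; no data is involved.
from_scipy = norm.pdf(x, loc=mean, scale=std)

# The same curve written out from the textbook formula.
by_hand = np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

print(np.allclose(from_scipy, by_hand))
```

Because only the mean and std enter the calculation, the resulting curve is bell-shaped regardless of how the underlying data are actually distributed.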

2 Answers

1

I want to check the overall histograms and pdf's before I make conclusions.

You can plot histograms using pyplot:

import matplotlib.pyplot as plt
plt.hist(data, bins=100)
plt.show()
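
If the goal is an empirical density rather than raw counts, one possible extension is to normalize the histogram and overlay a kernel density estimate. This is only a sketch under assumptions: the data array below is synthetic stand-in data, the output file name is arbitrary, and scipy.stats.gaussian_kde is a technique swapped in here, not something the answer above uses. Its Gaussian kernel is a smoothing choice and does not assume the data themselves are Gaussian.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Synthetic, deliberately skewed stand-in for the experimental data.
rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=5000)

fig, ax = plt.subplots()

# density=True rescales the bars so the histogram integrates to 1.
heights, edges, _ = ax.hist(data, bins=100, density=True, alpha=0.5,
                            label="histogram (density)")

# Smooth, non-parametric estimate of the density from the data alone.
kde = gaussian_kde(data)
x = np.linspace(data.min(), data.max(), 500)
density = kde(x)
ax.plot(x, density, "r-", label="kernel density estimate")

ax.legend()
fig.savefig("density_estimate.png")  # arbitrary file name
```

The bar heights times the bin widths sum to 1, which is the "area under the curve equals 1" property mentioned in the question, obtained without choosing a distribution family.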

0

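After some back and forth, I will offer this as an answer to the original question. Below is the list of distributions available in scipy.stats at the time of writing. Each continuous distribution has a pdf method (as in norm.pdf), so you can generate a theoretical PDF with any of them; the discrete distributions have a pmf method instead.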

Continuous distributions

  • alpha An alpha continuous random variable.
  • anglit An anglit continuous random variable.
  • arcsine An arcsine continuous random variable.
  • beta A beta continuous random variable.
  • betaprime A beta prime continuous random variable.
  • bradford A Bradford continuous random variable.
  • burr A Burr continuous random variable.
  • cauchy A Cauchy continuous random variable.
  • chi A chi continuous random variable.
  • chi2 A chi-squared continuous random variable.
  • cosine A cosine continuous random variable.
  • dgamma A double gamma continuous random variable.
  • dweibull A double Weibull continuous random variable.
  • erlang An Erlang continuous random variable.
  • expon An exponential continuous random variable.
  • exponnorm An exponentially modified Normal continuous random variable.
  • exponweib An exponentiated Weibull continuous random variable.
  • exponpow An exponential power continuous random variable.
  • f An F continuous random variable.
  • fatiguelife A fatigue-life (Birnbaum-Saunders) continuous random variable.
  • fisk A Fisk continuous random variable.
  • foldcauchy A folded Cauchy continuous random variable.
  • foldnorm A folded normal continuous random variable.
  • frechet_r A Frechet right (or Weibull minimum) continuous random variable.
  • frechet_l A Frechet left (or Weibull maximum) continuous random variable.
  • genlogistic A generalized logistic continuous random variable.
  • gennorm A generalized normal continuous random variable.
  • genpareto A generalized Pareto continuous random variable.
  • genexpon A generalized exponential continuous random variable.
  • genextreme A generalized extreme value continuous random variable.
  • gausshyper A Gauss hypergeometric continuous random variable.
  • gamma A gamma continuous random variable.
  • gengamma A generalized gamma continuous random variable.
  • genhalflogistic A generalized half-logistic continuous random variable.
  • gilbrat A Gibrat continuous random variable (the identifier is spelled gilbrat in this scipy.stats list).
  • gompertz A Gompertz (or truncated Gumbel) continuous random variable.
  • gumbel_r A right-skewed Gumbel continuous random variable.
  • gumbel_l A left-skewed Gumbel continuous random variable.
  • halfcauchy A Half-Cauchy continuous random variable.
  • halflogistic A half-logistic continuous random variable.
  • halfnorm A half-normal continuous random variable.
  • halfgennorm The upper half of a generalized normal continuous random variable.
  • hypsecant A hyperbolic secant continuous random variable.
  • invgamma An inverted gamma continuous random variable.
  • invgauss An inverse Gaussian continuous random variable.
  • invweibull An inverted Weibull continuous random variable.
  • johnsonsb A Johnson SB continuous random variable.
  • johnsonsu A Johnson SU continuous random variable.
  • ksone General Kolmogorov-Smirnov one-sided test.
  • kstwobign Kolmogorov-Smirnov two-sided test for large N.
  • laplace A Laplace continuous random variable.
  • logistic A logistic (or Sech-squared) continuous random variable.
  • loggamma A log gamma continuous random variable.
  • loglaplace A log-Laplace continuous random variable.
  • lognorm A lognormal continuous random variable.
  • lomax A Lomax (Pareto of the second kind) continuous random variable.
  • maxwell A Maxwell continuous random variable.
  • mielke A Mielke’s Beta-Kappa continuous random variable.
  • nakagami A Nakagami continuous random variable.
  • ncx2 A non-central chi-squared continuous random variable.
  • ncf A non-central F distribution continuous random variable.
  • nct A non-central Student’s T continuous random variable.
  • norm A normal continuous random variable.
  • pareto A Pareto continuous random variable.
  • pearson3 A Pearson type III continuous random variable.
  • powerlaw A power-function continuous random variable.
  • powerlognorm A power log-normal continuous random variable.
  • powernorm A power normal continuous random variable.
  • rdist An R-distributed continuous random variable.
  • reciprocal A reciprocal continuous random variable.
  • rayleigh A Rayleigh continuous random variable.
  • rice A Rice continuous random variable.
  • recipinvgauss A reciprocal inverse Gaussian continuous random variable.
  • semicircular A semicircular continuous random variable.
  • t A Student’s T continuous random variable.
  • triang A triangular continuous random variable.
  • truncexpon A truncated exponential continuous random variable.
  • truncnorm A truncated normal continuous random variable.
  • tukeylambda A Tukey-Lambda continuous random variable.
  • uniform A uniform continuous random variable.
  • vonmises A Von Mises continuous random variable.
  • wald A Wald continuous random variable.
  • weibull_min A Frechet right (or Weibull minimum) continuous random variable.
  • weibull_max A Frechet left (or Weibull maximum) continuous random variable.
  • wrapcauchy A wrapped Cauchy continuous random variable.

Multivariate distributions

  • multivariate_normal A multivariate normal random variable.
  • dirichlet A Dirichlet random variable.
  • wishart A Wishart random variable.
  • invwishart An inverse Wishart random variable.

Discrete distributions

  • bernoulli A Bernoulli discrete random variable.
  • binom A binomial discrete random variable.
  • boltzmann A Boltzmann (Truncated Discrete Exponential) random variable.
  • dlaplace A Laplacian discrete random variable.
  • geom A geometric discrete random variable.
  • hypergeom A hypergeometric discrete random variable.
  • logser A Logarithmic (Log-Series, Series) discrete random variable.
  • nbinom A negative binomial discrete random variable.
  • planck A Planck discrete exponential random variable.
  • poisson A Poisson discrete random variable.
  • randint A uniform discrete random variable.
  • skellam A Skellam discrete random variable.
  • zipf A Zipf discrete random variable.
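
As a rough illustration of how the names in these lists can be used, the sketch below fits a few continuous families to data and evaluates each fitted PDF, then evaluates a PMF for one discrete family. The data array is synthetic stand-in data and the choice of candidate families is arbitrary; neither comes from the question.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the experimental data (positive and skewed).
rng = np.random.default_rng(1)
data = rng.gamma(shape=2.0, scale=1.5, size=2000)

x = np.linspace(data.min(), data.max(), 200)

# Arbitrary candidate families, looked up by name in scipy.stats.
candidates = ["norm", "gamma", "lognorm"]
fitted = {}
for name in candidates:
    dist = getattr(stats, name)
    params = dist.fit(data)              # maximum-likelihood parameter estimates
    fitted[name] = dist.pdf(x, *params)  # theoretical PDF of the fitted family
    print(name, params)

# Discrete families expose pmf instead of pdf.
k = np.arange(0, 10)
pmf = stats.poisson.pmf(k, mu=3.0)
```

Each entry of fitted can be plotted against x on top of a density-normalized histogram to judge by eye which family describes the data best.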
