Using numpy.random.normal with arrays

Question

Suppose I have the following two arrays of means and standard deviations:

mu = np.array([2000, 3000, 5000, 1000])
sigma = np.array([250, 152, 397, 180])

Then:

a = np.random.normal(mu, sigma)

In [1]: a
Out[1]: array([1715.6903716 , 3028.54168667, 4731.34048645, 933.18903575])

However, if I ask for 100 draws for each element of mu and sigma:

a = np.random.normal(mu, sigma, 100)

Traceback (most recent call last):

  File "<ipython-input-417-4aadd7d15875>", line 1, in <module>
    a = np.random.normal(mu, sigma, 100)

  File "mtrand.pyx", line 1652, in mtrand.RandomState.normal

  File "mtrand.pyx", line 265, in mtrand.cont2_array

ValueError: shape mismatch: objects cannot be broadcast to a single shape

I have also tried passing a tuple as size, which fails as well:

s = (100, 100, 100, 100)
a = np.random.normal(mu, sigma, s)

What am I missing?

cs95 · Accepted Answer · 2018-03-08 21:10:02Z

3

As far as I can tell, a scalar size such as 100 does not work when the mean and std are arrays, because the requested output shape has to be broadcast-compatible with them. A simple workaround is to iterate over each (mean, std) pair and then concatenate:

np.concatenate(
   [np.random.normal(m, s, 100) for m, s in zip(mu, sigma)]
)

This gives you a (400,) array. If you want a (4, 100) array instead, call np.array instead of np.concatenate.
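
A minimal, self-contained sketch of the loop approach, showing both resulting shapes. The names draws, flat and stacked are illustrative and not part of the original answer.

```python
import numpy as np

mu = np.array([2000, 3000, 5000, 1000])
sigma = np.array([250, 152, 397, 180])

# One call per (mean, std) pair, 100 draws each.
draws = [np.random.normal(m, s, 100) for m, s in zip(mu, sigma)]

flat = np.concatenate(draws)  # all draws joined into one 1-D array
stacked = np.array(draws)     # one row per (mean, std) pair

print(flat.shape)     # (400,)
print(stacked.shape)  # (4, 100)
```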


4 Comments

user8682794 Over a year ago

Thank you. This was also my guess, as the documentation is not clear on this. I was hoping I could avoid iterating with a for loop, though.

cs95 Over a year ago

@user177324 Well, you can "avoid using a for loop", yes: np.array(list(map(np.random.normal, mu, sigma, [100] * len(mu)))). But if you want to know how to avoid calling the function more than once, I think it may not be possible.

user8682794 Over a year ago

Thank you, this is helpful indeed. I am just a bit worried that if I have to do this 10000 times, a for loop would be considerably slower.

cs95 Over a year ago

@user177324 Yes, if you want to generate 1 million random numbers, that would indeed be slow with a loop!

pthibault · Accepted Answer · 2018-03-08 21:43:41Z

2

If you want to make only one call, the normal distribution is easy enough to shift and rescale after the fact. (I'm making up a 10000-long vector of mu and sigma from your example here):

mu = np.random.choice([2000., 3000., 5000., 1000.], 10000)
sigma = np.random.choice([250., 152., 397., 180.], 10000)

# Draw standard normals once, then scale and shift each row by its own sigma and mu.
a = np.random.normal(size=(10000, 100)) * sigma[:, None] + mu[:, None]

This works fine. You can decide if speed is an issue. On my system the following is just 50% slower:

a = np.array([np.random.normal(m, s, 100) for m,s in zip(mu, sigma)])
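
The shift-and-rescale trick relies on a standard property of the normal distribution: if Z is standard normal (mean 0, std 1), then mu + sigma * Z is normal with mean mu and std sigma. The sketch below checks this empirically. It assumes the newer Generator API (np.random.default_rng) rather than the legacy np.random.normal used above, and the seed and the n_draws value are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded Generator; the seed is arbitrary

mu = np.array([2000., 3000., 5000., 1000.])
sigma = np.array([250., 152., 397., 180.])
n_draws = 100_000  # large, so sample statistics land close to mu and sigma

# Standard normal draws, one row per (mean, std) pair.
z = rng.standard_normal(size=(len(mu), n_draws))

# Scale, then shift each row; [:, None] reshapes (4,) to (4, 1) so it broadcasts across columns.
a = z * sigma[:, None] + mu[:, None]

print(a.shape)         # (4, 100000)
print(a.mean(axis=1))  # approximately mu
print(a.std(axis=1))   # approximately sigma
```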


1 Comment

Thomas Wagenaar Over a year ago

This is an excellent answer! Add some information on why this works (mathematically), and it's a perfect answer.

Eliam · Accepted Answer · 2023-11-15 18:02:26Z

This is an old question but I had the same issue recently and the documentation is still not clear at present, so my answer may be useful to other people.

If you want to draw n_sample samples from each of n_param (uncorrelated) normal distributions, the size argument needs to be the tuple (n_sample, n_param). The mu and sigma arrays, of shape (n_param,), are then broadcast along the last axis. Back to your example:

mu = np.array([2000, 3000, 5000, 1000])
sigma = np.array([250, 152, 397, 180])

n_sample = 10
n_param = len(mu)

np.random.normal(mu, sigma, (n_sample, n_param))

which returns, for example,

array([[2048.27840802, 2997.96810385, 4388.76381537,  834.58578664],
       [2284.62302217, 3057.37011582, 5141.42601472,  757.21437687],
       [1933.16814182, 3060.13736788, 5431.56812414,  949.80295487],
       [2444.69699622, 3049.32584965, 4850.82175943,  772.26041345],
       [2129.87928253, 2976.20614441, 5140.33783836, 1017.96741881],
       [1906.47137372, 2829.44037933, 4894.20964032, 1245.29240452],
       [2031.94886175, 2693.19106648, 5385.33674047,  849.72485587],
       [2034.22639971, 3017.86916011, 5050.08920701, 1198.48286148],
       [2278.8297283 , 3036.31308636, 5043.93694099,  988.87438521],
       [1760.04486593, 2875.0750094 , 4615.1775128 ,  946.76458665]])
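
A short sketch of the broadcasting rule behind this, assuming the same mu and sigma as in the question. The names by_column, by_row and scalar_size_failed are illustrative. It also shows one way to get the transposed (n_param, n_sample) layout in a single call, by giving mu and sigma a trailing axis of length 1.

```python
import numpy as np

mu = np.array([2000, 3000, 5000, 1000])
sigma = np.array([250, 152, 397, 180])
n_sample = 100
n_param = len(mu)

# Shape (4,) broadcasts against the last axis of (100, 4): one column per (mean, std) pair.
by_column = np.random.normal(mu, sigma, (n_sample, n_param))

# For one row per pair, reshape mu and sigma to (4, 1) so they broadcast against (4, 100).
by_row = np.random.normal(mu[:, None], sigma[:, None], (n_param, n_sample))

print(by_column.shape)  # (100, 4)
print(by_row.shape)     # (4, 100)

# A scalar size means shape (100,), which cannot be broadcast with shape (4,).
scalar_size_failed = False
try:
    np.random.normal(mu, sigma, n_sample)
except ValueError as exc:
    scalar_size_failed = True
    print("ValueError:", exc)
```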
