I'm trying to create a Monte Carlo simulation of future stock prices using NumPy arrays.

My current approach is: create a for loop which fills an array, stock_price_array, with simulated stock prices. Each stock price is generated by taking the last stock price and multiplying it by 1 + an annual return. The annual returns are drawn randomly from a normal distribution and stored in the array annual_ret.

My problem is that although the stock_price values I print from my for loop appear to be correct, I cannot figure out how to append them to stock_price_array.

I've tried various methods, including initializing stock_price_array with np.full instead of np.empty, changing where the array appears in the for loop, and checking the size of the array.

I've read other Stack Overflow posts on similar topics but can't figure out what I'm doing wrong.

Thank you in advance for your help!

import numpy as np

annual_mean = .06
annual_stdev = .15
start_stock_price = 100

numYears = 3
numSimulations = 4
stock_price_array = np.empty(numYears)

# draw an annual return from a normal distribution; this annual return will be random
annual_ret = np.random.normal(annual_mean, annual_stdev, numSimulations)

for i in range(numYears):
    stock_price = np.multiply(start_stock_price, (1 + annual_ret[i]))
    np.append(stock_price_array, [stock_price])
    start_stock_price = stock_price


  • np.append creates a new array, and you don't do anything with the result. Note that you shouldn't be appending to a NumPy array in a loop to begin with; it is very inefficient. Instead, use a regular list, then convert it to an array at the end. Commented Feb 6, 2023 at 1:09
  • But this would work if you did stock_price_array = np.append(stock_price_array, [stock_price]). Commented Feb 6, 2023 at 1:10
  • First, np.empty(n) makes an array with n elements; it is not the equivalent of the list []. Second, don't use a function like np.append without actually reading its docs. Third, np.append is a poorly named function that is OK for adding one value to a 1d array; for anything else, forget it. It is not a list append clone. Commented Feb 6, 2023 at 1:42
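
The points made in these comments can be checked with a minimal sketch. The sample values (106.0, 112.4, 119.1) and the names prices, result, collected and price_array are made up for illustration and do not come from the question.

```python
import numpy as np

# np.empty(3) already holds 3 (uninitialized) slots; it is not like the list []
prices = np.empty(3)
print(prices.size)

# np.append returns a NEW array and leaves its input untouched
result = np.append(prices, [106.0])
print(prices.size, result.size)

# Pattern suggested in the comments: collect in a list, convert once at the end
collected = []
for value in (106.0, 112.4, 119.1):  # illustrative values only
    collected.append(value)          # list.append mutates the list in place
price_array = np.array(collected)
print(price_array.shape)
```

If the return value of np.append is discarded, as in the question's loop, the original array never changes.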

1 Answer

The first rule of NumPy is: never iterate over your array yourself. Use a NumPy function that does the whole computation in one batch. Those functions still iterate over the array internally, of course, but that iteration does not happen in Python, so it is much faster.

No-for solution

For example, here you could do something like this:

np.cumprod(np.hstack([start_stock_price, annual_ret+1]))

This first builds an array made of the initial value followed by the growth factors. So if the initial value is 100 and the annual returns are 0.1, -0.1, 0.2, 0.2 (for example), then hstack builds the array 100, 1.1, 0.9, 1.2, 1.2.

And cumprod then just builds the cumulative product of those values:

100, 100×1.1=110, 100×1.1×0.9=110×0.9=99, 100×1.1×0.9×1.2=99×1.2=118.8, 100×1.1×0.9×1.2×1.2=118.8×1.2=142.56
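
The worked numbers above can be reproduced with a short sketch. The fixed annual_ret values are the illustrative ones from the example, not random draws, and the names factors and prices are introduced here for readability.

```python
import numpy as np

start_stock_price = 100
annual_ret = np.array([0.1, -0.1, 0.2, 0.2])  # illustrative returns from the example

# Initial value followed by the growth factors: 100, 1.1, 0.9, 1.2, 1.2
factors = np.hstack([start_stock_price, annual_ret + 1])

# Running product: each entry is the previous price times the next factor
prices = np.cumprod(factors)
print(np.round(prices, 2))
```

The first entry of prices is the starting price itself, so the result has one more element than annual_ret.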

Fixing your loop

To answer your initial question anyway (even though I strongly advise using a solution like the cumprod one shown above), you have 2 choices:

  • Either you allocate the array in advance, as you did (your stock_price_array = np.empty(numYears)). Then, instead of trying to append the new stock_price to stock_price_array, you simply fill one of the empty slots that are already there, by doing stock_price_array[i] = stock_price

  • Or you don't. Then you replace the np.empty line with stock_price_array = [], and at each step you append the result to create a new stock_price_array, like this: stock_price_array = np.append(stock_price_array, [stock_price])

I strongly advise against the 2nd solution. Since you already know the final size of the array, it is much better to create it once, because np.append creates a brand new array and then copies the input data into it. It does not just extend the existing array (generally speaking, we can't do that anyway).

But, well, I advise against both solutions anyway, since I find mine (with cumprod) preferable. for is the taboo word in NumPy, and even more so when what is inside the for creates a new array, as append does.
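
The first option (preallocate, then fill by index) could look roughly like the sketch below. Two things here are assumptions rather than part of the question: a seeded np.random.default_rng generator is used so the run is reproducible, and numYears returns are drawn instead of numSimulations. The names last_price and expected are also introduced only for this sketch.

```python
import numpy as np

annual_mean = 0.06
annual_stdev = 0.15
start_stock_price = 100
numYears = 3

rng = np.random.default_rng(0)  # assumption: seeded generator for reproducibility
annual_ret = rng.normal(annual_mean, annual_stdev, numYears)

stock_price_array = np.empty(numYears)  # allocated once, filled in place below
last_price = start_stock_price
for i in range(numYears):
    last_price = last_price * (1 + annual_ret[i])
    stock_price_array[i] = last_price   # assign to slot i instead of appending

# Cross-check against the loop-free version (without the leading start price)
expected = start_stock_price * np.cumprod(1 + annual_ret)
print(np.allclose(stock_price_array, expected))
```

Keeping a separate last_price variable also avoids overwriting start_stock_price, which the original loop does.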

Monte Carlo

Since you mentioned Monte Carlo and then showed code that computes only one result (you draw 1 set of annual returns and perform one computation of future values), I am wondering if that is really what you want. In particular, I see that you have numSimulations and numYears, which appear to play redundant roles in your code (and therefore in mine). The only reason it doesn't just throw an IndexError is that numSimulations is used only to decide how many annual_ret values you draw. And since numSimulations > numYears, you have more than enough annual_ret values to compute the result.

Wasn't your initial intention to repeat the simulation over the years numSimulations times, to get numSimulations results?

In that case, you probably need numSimulations sets of numYears annual returns, so a 2D array. Likewise, you should be computing numSimulations series of numYears results.

If my guess is not completely off, I surmise that what you really wanted was something along the lines of:

# 2D array of annual returns: 1 simulation per row, 1 year per column
annual_ret = np.random.normal(annual_mean, annual_stdev, (numSimulations, numYears))

# Add 1 as we did earlier, and pad each simulation with an initial 100 (`start_stock_price`) at the beginning
t = np.pad(annual_ret+1, ((0,0), (1,0)), constant_values=start_stock_price)

# Cumulative multiplication. `axis=1` means it is done along axis 1 (along years) for each row (for each simulation)
res = np.cumprod(t, axis=1)
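
As a sanity check, the vectorized result can be compared with a plain double loop. This is only a sketch: the seeded np.random.default_rng generator and the names check, s, y and price are assumptions introduced here, and the loop exists purely to verify the result, not as a recommended approach.

```python
import numpy as np

annual_mean, annual_stdev = 0.06, 0.15
start_stock_price = 100
numYears, numSimulations = 3, 4

rng = np.random.default_rng(42)  # assumption: seeded generator for reproducibility
annual_ret = rng.normal(annual_mean, annual_stdev, (numSimulations, numYears))

# Vectorized version: prepend the start price to each row, then multiply along years
t = np.pad(annual_ret + 1, ((0, 0), (1, 0)), constant_values=start_stock_price)
res = np.cumprod(t, axis=1)

# Reference version: one explicit loop per simulation and per year
check = np.empty((numSimulations, numYears + 1))
for s in range(numSimulations):
    price = start_stock_price
    check[s, 0] = price
    for y in range(numYears):
        price = price * (1 + annual_ret[s, y])
        check[s, y + 1] = price

print(res.shape, np.allclose(res, check))
```

Each row of res is one simulated path of numYears + 1 prices, starting at start_stock_price.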

10 Comments

Wow, I didn't know about assigning a 2D array. I love what you did with annual_ret. Your intuition about what I want the simulation to do is correct: I'd been trying to take the problem in chunks, and hadn't gotten to using numSimulations yet. I think your approach is definitely better than mine, and I realize I shouldn't have been using a for loop. I think the code does what I want it to do, but I'm not 100% sure. Part of my confusion is that when I run plt.plot(res) I see a bunch of lines which do not look like random stock prices to me. I'd expect to see numSimulations lines.
Do you have any idea why graphing plt.plot(res) wouldn't show numSimulations number of lines and would instead show numYears number of lines?
Yes. It is how plot works. It is meant to be used with a 1D array of scalars (numbers) as an argument, and it draws the corresponding line (curve), showing the progression of those numbers. You know that already. But you can also pass an array of pairs of numbers, and then it will draw two curves. Or an array of triplets of numbers, and then it will draw three curves. And so on.
Now, an array of groups of numbers is a 2D array. The point is that, for arrays with more than 1 dimension, each column of the array is one curve. Each row corresponds to one abscissa (either the row index or, if you passed a second array for x, the matching value of x), and within each row, column k holds the value of curve k at that abscissa.
So, long story short, if you want to draw n curves of length L with a single plt.plot, you need to pass an L×n 2D array, that is, L rows and n columns, where each column is a curve.
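
Applied to res, which has one simulation per row, this suggests passing the transpose so that each simulation becomes a column. A minimal sketch follows, assuming matplotlib is installed. The helper plot_paths, the name curves, the seeded generator and numSimulations = 5 (chosen so the two shapes differ visibly) are all assumptions for illustration.

```python
import numpy as np

start_stock_price = 100
numYears, numSimulations = 3, 5  # 5 simulations here so the shapes are easy to tell apart

rng = np.random.default_rng(1)   # assumption: seeded generator for reproducibility
annual_ret = rng.normal(0.06, 0.15, (numSimulations, numYears))
t = np.pad(annual_ret + 1, ((0, 0), (1, 0)), constant_values=start_stock_price)
res = np.cumprod(t, axis=1)      # shape (numSimulations, numYears + 1)

# plot draws one curve per COLUMN, so put each simulation in its own column
curves = res.T                   # shape (numYears + 1, numSimulations)
print(res.shape, curves.shape)


def plot_paths(paths):
    """Hypothetical helper: draw one line per column of `paths`."""
    import matplotlib.pyplot as plt  # imported lazily; assumes matplotlib is available

    plt.plot(paths)
    plt.xlabel("year")
    plt.ylabel("simulated price")
    plt.show()
```

Calling plot_paths(curves) should then draw numSimulations lines, each with numYears + 1 points.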