I am trying to optimize forecast parameters with scipy.optimize. I followed a tutorial and also found some nice examples here on Stack Overflow, but I am facing an issue that I cannot resolve. I am starting to wonder whether using pandas with scipy is a poor choice.
I have set up my code as follows:
import simpy
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import pandas as pd
import statistics as stat
import math as m
#from sklearn.grid_search import ParameterGrid
from scipy.optimize import minimize
###dataframe for the simulation
df = pd.read_csv('simulation_df_data_2018_2.csv')
with pd.option_context("max_rows", None,"max_columns", None):
    print(df.head())
for i in df.index:
    alpha = 0.2
    beta = 0.3
    x = np.array([alpha, beta])
    def holts(x):
        LO = np.int(df['average_demand'].loc[i])
        print(type(LO))
        TO = ((df['m2'].loc[i] - df['m3'].loc[i]) + (df['m1'].loc[i] - df['m2'].loc[i])) / 2
        L1 = round(x[0] * df['m3'].loc[i] + (1 - x[0]) * (
            LO + TO))
        T1 = x[1] * (L1 - LO) + (1 - x[1]) * TO
        L2 = round(x[0] * df['m2'].loc[i] + (1 - x[0]) * (
            L1 + T1))
        T2 = x[1] * (L2 - L1) + (1 - x[1]) * T1
        L3 = round(x[0] * df['m1'].loc[i] + (1 - x[0]) * (
            L2 + T2))
        T3 = beta * (L3 - L2) + (1 - beta) * T2
        LT1 = round(L3 + 1 * T3)
        MSE = ((df['m3'].loc[i] - L1) + (df['m2'].loc[i] - L2) + (
            df['m2'].loc[i] - L3)) ** 2 / 3
        return MSE
    #print(holts(x))
    x0 = [0.1,0.1]
    result = minimize(holts, x0, bounds=[(0,1),(0,1)], method="SLSQP")
    print(result)
    print(x)
and df looks like this:
m1 m2 m3 m4 m5 m6 m7 m8 m9 m10 m11 \
0 0.0 8.0 2.0 0.0 14.0 0.0 5.0 2.0 4.0 4.0 10.0
1 4.0 55.0 2.0 72.0 38.0 87.0 113.0 2.0 0.0 165.0 2.0
2 18.0 34.0 6.0 63.0 14.0 18.0 33.0 35.0 51.0 0.0 24.0
3 0.0 21.0 3.0 10.0 15.0 0.0 32.0 1.0 3.0 17.0 0.0
4 96.0 106.0 237.0 136.0 138.0 116.0 167.0 158.0 110.0 115.0 161.0
m12 m13 m14 m15 m16 m17 m18 m19 m20 m21 m22 \
0 0.0 6.0 10.0 0.0 2.0 2.0 17.0 0.0 0.0 0.0 0.0
1 35.0 7.0 88.0 6.0 3.0 103.0 18.0 59.0 6.0 20.0 152.0
2 6.0 5.0 99.0 7.0 17.0 15.0 8.0 3.0 21.0 6.0 4.0
3 30.0 5.0 88.0 1.0 6.0 10.0 9.0 17.0 9.0 0.0 1.0
4 116.0 77.0 48.0 96.0 69.0 77.0 96.0 74.0 94.0 101.0 115.0
m23 m24 average_demand low_demand high_demand
0 0.0 0.0 3.583333 0.0 17.0
1 6.0 0.0 43.458333 0.0 165.0
2 14.0 12.0 21.375000 0.0 99.0
3 0.0 0.0 11.583333 0.0 88.0
4 158.0 167.0 117.833333 48.0 237.0
I am very confused about the error I keep getting. Here is the traceback:
Traceback (most recent call last):
File "/Users/pierre/Desktop/simul/forecast_holts_alpha.py", line 121, in <module>
result = minimize(holts, x0, args= coef_list,bounds=[(0,1),(0,1)], method="SLSQP")
File "/Users/pierre/Desktop/Django-app/lib/python3.7/site-packages/scipy/optimize/_minimize.py", line 618, in minimize
constraints, callback=callback, **options)
File "/Users/pierre/Desktop/Django-app/lib/python3.7/site-packages/scipy/optimize/slsqp.py", line 399, in _minimize_slsqp
fx = func(x)
File "/Users/pierre/Desktop/Django-app/lib/python3.7/site-packages/scipy/optimize/optimize.py", line 327, in function_wrapper
return function(*(wrapper_args + args))
File "/Users/pierre/Desktop/simul/forecast_holts_alpha.py", line 63, in holts
LO = np.int(df['average_demand'].loc[i])
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
I don't understand why this error is raised there, especially because when I check the type of LO I get this:
print(type(LO))
<class 'int'>
I am not an experienced programmer, so I am struggling to figure out what is going on. Any help would be appreciated!
UPDATE:
fun: 56.333333333333336
jac: array([0., 0.])
message: 'Optimization terminated successfully.'
nfev: 4
nit: 1
njev: 1
status: 0
success: True
x: array([0.1, 0.1])
[0.2,0.3]
The output looks like this, but it does not seem to be optimizing anything: x stays at the initial guess x0.
The scipy.optimize.minimize documentation can be helpful. Your holts function signature does not match the one in the docs, which should be fun(x, *args) -> float. Since x0 = [0.1, 0.1] in your code, I think x should be a 1-D array with shape (2,).

In fun(x, *args) -> float, x should hold the variables of the function you want to minimize. In your code, you want to minimize holts, but holts(df, coef_list) has df in the place of the variables x, which seems to be wrong.

With respect to which variables do you want to minimize holts? Your holts function seems to have no variables.

alpha and beta are the variables, so x needs to be an array containing alpha and beta.
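To make the signature point concrete, here is a minimal sketch of an objective written as fun(x, *args) -> float, with x = [alpha, beta] and the row data passed through args. Several things in it are assumptions rather than the asker's code: the name holts_mse is hypothetical, the small DataFrame only copies m1, m2, m3 and average_demand for rows 1 and 4 of the frame shown in the question, and the error is measured against the one-step-ahead forecast (level + trend), which differs from the MSE formula in the question. The round() calls are left out on purpose. Rounding makes the objective piecewise constant, so the finite-difference gradient is zero, which is consistent with jac: array([0., 0.]) and nit: 1 in the posted output. The last lines show that indexing a NumPy array with a string raises the same IndexError as in the traceback, which is what happens when the parameter array arrives where the DataFrame was expected.

```python
import numpy as np
import pandas as pd
from scipy.optimize import minimize

# Illustrative subset (an assumption): rows 1 and 4 of the question's frame.
df = pd.DataFrame({
    "m1": [4.0, 96.0],
    "m2": [55.0, 106.0],
    "m3": [2.0, 237.0],
    "average_demand": [43.458333, 117.833333],
})


def holts_mse(x, m1, m2, m3, level0):
    """Mean squared one-step-ahead error of Holt's method over three periods.

    x is the 1-D array [alpha, beta] that minimize() varies; the remaining
    arguments are fixed data supplied through args=.
    """
    alpha, beta = x
    level = level0
    trend = ((m2 - m3) + (m1 - m2)) / 2  # initial trend as in the question
    sq_errors = []
    for obs in (m3, m2, m1):  # same processing order as the question
        forecast = level + trend
        sq_errors.append((obs - forecast) ** 2)
        new_level = alpha * obs + (1 - alpha) * forecast  # no rounding
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return float(np.mean(sq_errors))


x0 = np.array([0.1, 0.1])
results = {}
for i in df.index:
    row = df.loc[i]
    results[i] = minimize(
        holts_mse,
        x0,
        args=(row["m1"], row["m2"], row["m3"], row["average_demand"]),
        bounds=[(0, 1), (0, 1)],
        method="SLSQP",
    )
    print(i, results[i].x, results[i].fun)

# Indexing a NumPy array with a string reproduces the IndexError
# from the traceback.
try:
    np.array([0.1, 0.1])["average_demand"]
    index_error_raised = False
except IndexError:
    index_error_raised = True
```

Because the data travels through args, the function does not depend on the loop variable i or on a global df, so it can be defined once outside the loop. With only three observations per row the fitted alpha and beta may well land on a bound, so treat the numbers as a check that the optimizer moves rather than as a usable forecast.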