Outline: I am using a vector autoregression (VAR) model from the statsmodels package https://www.statsmodels.org/stable/vector_ar.html#var.

My two time series, call them ts1 and ts2, are each 774 sampling points long. I train the VAR on the first 80% of the data and then use the remaining 20% for prediction/forecasting. The optimal lag order was determined using the Bayesian information criterion (BIC).

Moreover, since both time series were non-stationary, I first applied first-order differencing to them and subsequently a linear detrending. The additional linear detrending was used because a minimal linear trend was still left in the data after differencing.
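The preprocessing described above could look roughly like the sketch below. The helper name difference_and_detrend and the example series are hypothetical, and the straight line is fitted by ordinary least squares with np.polyfit, which may differ from the detrending routine actually used.

```python
import numpy as np

def difference_and_detrend(x):
    """First-order difference, then subtract a least-squares straight line."""
    dx = np.diff(np.asarray(x, dtype=float))
    t = np.arange(dx.size)
    slope, intercept = np.polyfit(t, dx, deg=1)
    return dx - (slope * t + intercept)

# Hypothetical example series: quadratic trend plus noise
rng = np.random.default_rng(0)
t = np.arange(774)
ts = 0.001 * t**2 + rng.normal(size=t.size)

stationary = difference_and_detrend(ts)
```

Note that subtracting a fitted line with an intercept leaves a series whose sample mean is zero by construction, and that differencing shortens the series by one point.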

Code: Here is an example of my code.

from statsmodels.tsa.api import VAR
import pandas as pd

df = pd.DataFrame({"ts2": ts2,
                   "ts1": ts1})
len_train = round(len(df) * 0.8)
train = df[:len_train]

# Fit the VAR on the first 80% of the data (training set only)
model = VAR(train)
model_fit = model.fit(bic_lag)

# Prediction/forecast on the last 20% test data
test = df[len_train:]
len_test = len(test)

# Seed the forecast with the last k_ar training observations (2d numpy array),
# so that the forecast starts where the training data end
lag_order = model_fit.k_ar
seed = train.to_numpy()[-lag_order:]

# Prediction
y_pred = model_fit.forecast(y=seed, steps=len_test)

# Select first column (ts2)
y_pred = y_pred[:, 0]

# True values of ts2 over the test period
y_true = test["ts2"].to_numpy()

Problem: Now, we can plot the predicted y_pred (in red color) and the real values y_true (in blue color), shown in the plot below. As we can see, the prediction y_pred quickly converges to zero and remains there.

[Plot: predicted y_pred (red) versus true y_true (blue)]

Question: Here is my question as someone still new to prediction and forecasting. Why does the model not predict properly beyond ~20 data points? What are possible reasons for the prediction converging to zero so quickly?

Finally, if required, I can add the two time-series here for reproducibility. I refrained from this for now because I did not want to "spam" my question with a long list of data.

  • Provided no one finds a blatant error in your code, and your question is about the correct use of a model, you might get more answers after posting on Cross-Validated or Data-Science, although, looking at the tags, [statsmodels] still gets more traffic here on Stack-Overflow. Commented Jun 28, 2024 at 10:41
  • This always happens when forecasting with the VAR model. VAR gives a mean for long-term forecasting, which can be computed by hand. Using a larger lag order can help if you want to forecast a bit longer. Commented Aug 11, 2024 at 13:35
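To illustrate the comment above, here is a small numpy-only sketch of why a multi-step VAR forecast settles on a constant. The VAR(1) coefficients c and A are made up for illustration and are not fitted to the data in the question; the sketch assumes a stable model (all eigenvalues of A inside the unit circle).

```python
import numpy as np

# Hypothetical stable VAR(1): y_t = c + A @ y_{t-1} + noise
c = np.array([0.1, -0.2])
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])

# Long-run mean computed by hand: mu = (I - A)^{-1} c
mu = np.linalg.solve(np.eye(2) - A, c)

def forecast(y_last, steps):
    """Iterate the VAR(1) recursion with future noise set to its mean (zero)."""
    y = np.asarray(y_last, dtype=float)
    path = []
    for _ in range(steps):
        y = c + A @ y
        path.append(y)
    return np.array(path)

# Start far from the mean; each step shrinks the distance to mu
path = forecast([3.0, -2.0], steps=50)
distance = np.linalg.norm(path - mu, axis=1)
```

Each step multiplies the deviation from mu by A, so the forecast decays geometrically toward mu. For a VAR(p) the same argument gives mu = (I - A_1 - ... - A_p)^{-1} c. If the input series were differenced and detrended so that their mean is about zero, the fitted intercept and therefore mu will also be close to zero, which would be consistent with the flat red line in the question.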
