Outline: I am using a vector autoregression (VAR) model from the statsmodels package: https://www.statsmodels.org/stable/vector_ar.html#var.
My two time series, call them ts1 and ts2, are each 774 sampling points long. I train the VAR on the first 80% of the data and use the remaining 20% for prediction/forecasting. The optimal lag order was determined using the Bayesian information criterion (BIC).
Moreover, since both time series were non-stationary, I first applied first-order differencing and then a linear detrending. The additional detrending was needed because a small linear trend remained in the data after differencing.
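For reference, a minimal, self-contained sketch of this preprocessing step (the synthetic ts1/ts2 here are stand-ins for my real series, which the snippet below does not include; I use scipy's detrend for the linear detrending):

```python
import numpy as np
from scipy.signal import detrend

rng = np.random.default_rng(0)
# Synthetic stand-ins: random walks with a small linear trend,
# each 774 points long like the real series
ts1 = np.cumsum(rng.normal(size=774)) + 0.05 * np.arange(774)
ts2 = np.cumsum(rng.normal(size=774)) + 0.03 * np.arange(774)

def preprocess(x):
    # First-order differencing removes the stochastic (unit-root) trend
    diffed = np.diff(np.asarray(x))
    # Linear detrending removes the residual linear trend
    # (least-squares line is subtracted, so the result is zero-mean)
    return detrend(diffed, type="linear")

ts1_stat = preprocess(ts1)
ts2_stat = preprocess(ts2)
```

Differencing shortens each series by one point, so the processed series are 773 points long.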
Code: Here is an example of my code.
from statsmodels.tsa.api import VAR
import pandas as pd

df_train = pd.DataFrame({"ts2": ts2,
                         "ts1": ts1})

# Train/test split: first 80% for fitting, last 20% for forecasting
len_train = round(len(df_train) * 0.8)
len_test = len(df_train) - len_train
train = df_train[:len_train]
test = df_train[len_train:]

# Fit the VAR on the training data; lag order selected by BIC
model = VAR(train)
bic_lag = model.select_order(maxlags=15).bic  # search up to 15 lags, for example
model_fit = model.fit(bic_lag)

# Forecast the test period, seeding the recursion with the last
# bic_lag observations of the training data
y_input = train.to_numpy()[-model_fit.k_ar:]
y_pred = model_fit.forecast(y=y_input, steps=len_test)

# Select first column (ts2)
y_pred = y_pred[:, 0]

# True values
y_true = test["ts2"].to_numpy()
Problem:
Now we can plot the predicted values y_pred (in red) against the true values y_true (in blue), shown in the plot below. As the plot shows, the prediction y_pred quickly converges to zero and stays there.
Question: As someone still new to prediction and forecasting: why does the model not predict properly beyond roughly 20 points? What are possible reasons for the prediction converging to zero so quickly?
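For what it is worth, I can reproduce the same qualitative behavior in a toy univariate AR(1) model (coefficient phi chosen arbitrarily for illustration): the multi-step forecast shrinks geometrically toward the process mean, which is zero for my differenced, detrended data.

```python
import numpy as np

rng = np.random.default_rng(1)
phi = 0.6  # illustrative AR coefficient, chosen arbitrarily
n = 500

# Simulate a zero-mean AR(1) series: x_t = phi * x_{t-1} + noise
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

# Recursive multi-step forecast from the last observation:
# E[x_{t+h} | x_t] = phi**h * x_t, which decays geometrically to 0
horizon = 30
forecasts = [phi**h * x[-1] for h in range(1, horizon + 1)]
```

After 30 steps the forecast is essentially indistinguishable from zero, much like my y_pred.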
Finally, if required, I can add the two time series here for reproducibility. I have refrained from doing so for now because I did not want to "spam" the question with a long list of numbers.
