I am working on a regression task where my feature matrix consists of two features: a linear term and its square (the quadratic feature). My model predicts values correctly, but after numerous attempts I noticed that, to visualize the predictions, I only need to plot the first column of my feature matrix (i.e., the linear feature) against the predictions.

Here’s the relevant part of my code:

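import numpy as np
import matplotlib.pyplot as plt

# `model` is a regressor already fitted on the training data X, y (not shown)
X_custom_1 = np.arange(-5, 5, 0.01).reshape(1000, 1)  # linear feature
X_custom_2 = X_custom_1**2                            # quadratic feature
X_custom = np.append(X_custom_1, X_custom_2, axis=1)
y_pred = model.predict(X_custom)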

# Debugging
print(X_custom_1.shape, X_custom_2.shape)  # Output: (1000, 1) (1000, 1)
print(X_custom.shape, y_pred.shape)        # Output: (1000, 2) (1000, 1)

# Plotting
_, ax = plt.subplots(ncols=2, figsize=(19, 7))
ax[0].scatter(X, y)  # Original data: (50, 1) (50, )
ax[0].plot(X_custom[:, 0], y_pred, color='red')  # Model predictions

In the ax[0].plot() line, I am plotting X_custom[:, 0] (the linear feature) against y_pred, which produces the plot below:

[Image: plot of the original data and the model predictions]

My question is: Why does it make sense to plot the predictions against only the first feature of the input matrix, rather than using all features? Is it because a line plot inherently works only for a single feature?

Any clarification is appreciated.

  • Welcome to Stack Overflow! What is the "original data" X and y? Is it the same as X_custom_1 and X_custom_2? To further clarify your question, it would be nice to add an image of the plot you are getting. At the moment, I would think that you are getting a quadratic scatter plot from -5 to 5 and two line plots, one quadratic, one linear. y_pred[:,0] vs y_pred[:,1] might be the plot you are looking for, but it depends on what X and y are. Commented Jan 7 at 6:13
  • By "original data" I can safely assume that you are making reference to the data I have used for fitting/training, which is of course different from X_custom_1 and X_custom_2. In a bit I will adjust my post accordingly to also reveal information about plotting. Commented Jan 7 at 19:03

1 Answer

Your model has only one independent variable: both of your features are functions of the single variable X_custom_1, namely feature1(X_custom_1) = X_custom_1 and feature2(X_custom_1) = X_custom_1 ** 2. So it makes sense to plot your predictions against that single independent variable, which in your case is the same as your first feature. You could equally plot against either of your features if you wanted; it just depends on which relation you want to show. Note, though, that the squared feature takes the same value for x and -x, so a plot against it would fold the two halves of the curve on top of each other.
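To make this concrete, here is a minimal sketch rather than the asker's actual setup. It assumes made-up training data, and an ordinary least-squares fit via np.linalg.lstsq stands in for the unknown `model`; the names make_features, x_train, y_train and x_grid are hypothetical. It shows that the prediction is a single-valued function of x, while two different inputs can share the same squared feature and still get different predictions.

```python
import numpy as np

# Hypothetical training data (an assumption, not from the question):
# y depends on a single underlying variable x.
rng = np.random.default_rng(0)
x_train = rng.uniform(-5, 5, size=50)
y_train = 2.0 + 1.5 * x_train + 0.5 * x_train**2 + rng.normal(0, 0.1, size=50)


def make_features(x):
    """Build the design matrix [1, x, x**2] from the single variable x."""
    x = np.asarray(x, dtype=float).ravel()
    return np.column_stack([np.ones_like(x), x, x**2])


# Ordinary least squares stands in for the question's fitted `model`.
coef, *_ = np.linalg.lstsq(make_features(x_train), y_train, rcond=None)

# Predictions on a dense grid: one prediction per value of x.
x_grid = np.arange(-5, 5, 0.01)
features = make_features(x_grid)
y_pred = features @ coef

# x = -2 and x = 2 have the same squared feature but different predictions,
# so the squared feature alone does not identify a point on the curve.
pred_neg, pred_pos = make_features([-2.0, 2.0]) @ coef
same_square = (-2.0) ** 2 == 2.0**2
print(same_square, pred_neg, pred_pos)
```

Plotting y_pred against x_grid (the linear feature) therefore draws the fitted parabola once, whereas plotting it against x_grid**2 would overlay the branch for negative x on the branch for positive x.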
