I'm trying to perform a Gaussian Process Regression on the following series of data points (X and Y):
X Y
0.00001 -6.8200000000000003
0.00014 -6.8073846153846160
0.00042 -6.6818333333333326
0.00200 0.5002400000000000
0.00545 -0.0163174311926606
0.00545 -0.0163174311926606
0.00545 -0.0163174311926606
0.00552 -0.0317916666666667
0.00604 -0.1288741721854305
0.00756 -0.2954391534391534
0.00987 -0.3925845997973657
0.02201 -0.2940145454545455
0.02618 -0.2296286259541984
0.04000 -0.1075512500000000
0.05737 -0.0801517421602787
0.05761 -0.0801819444444445
0.05762 -0.0801947916666667
0.05767 -0.0801204506065858
0.05771 -0.0801724436741768
0.05772 -0.0801856152512998
0.05773 -0.0801980935875217
0.05782 -0.0801769896193772
0.05783 -0.0801899653979239
0.05788 -0.0801169257340242
0.05789 -0.0801297063903282
0.05804 -0.0801881034482759
0.05948 -0.0800737815126051
0.07886 -0.0843219264892269
0.10000 -0.0866689000000000
I'm using GPy to optimize the hyperparameters while fixing the noise variance to 0:
GP_regression. | value | constraints | priors
rbf.variance | 1.9537980554282994 | +ve |
rbf.lengthscale | 0.47928874454359643 | +ve |
Gaussian_noise.variance | 0.0 | +ve fixed |
Then I compute the values of Y* at each test point X* using the optimized lengthscale and variance from GPy. Since I normalize the data when optimizing the hyperparameters, I also normalize X* at the beginning of the function and undo the normalization at the end. I invert the kernel matrix via Cholesky decomposition, adding a very small jitter to handle matrices with nearly linearly dependent columns:
import numpy as np
from scipy.linalg import cholesky, cho_solve

# X and Y are the normalized training data; kernel is the optimized GPy RBF kernel
# and K = kernel.K(X, X) is the kernel matrix of the training inputs
L = cholesky(K + np.eye(len(X)) * 1e-11, lower=True)  # add jitter for stability
# Solve K * alpha = Y for alpha
alpha_imag = cho_solve((L, True), Y)

def gp_surrogate_model_imag(x_star):
    x_star = np.atleast_1d(x_star).reshape(-1, 1)
    # Normalize the input with the training statistics
    x_star_norm = (x_star - X_mean_old) / X_std_old
    # Compute the covariance vector k(X*, X)
    k_star = kernel.K(x_star_norm, X)
    Y_surrogate = k_star @ alpha_imag
    # Convert back to the original scale (denormalization)
    Y_surrogate = Y_surrogate * Y_std_old + Y_mean_old
    return Y_surrogate.item()
Since the noise variance is 0, the points of the surrogate model should coincide exactly with the training points, right?
However, when I compute the differences between Y and Y* at the training points, I obtain differences on the order of 10^-5.
Is this an issue of my implementation, or are such small differences expected even when the noise variance is fixed to 0 in GPy?
The largest differences occur around x = 0.00042 and x = 0.00545. Residual errors of 1e-5 might be plausible if the matrix inversion is a bit tricky.
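A minimal NumPy sketch (with hypothetical clustered inputs in the spirit of the data above, not the exact table) shows where residuals of this size can come from. Because the solve uses K + jitter·I rather than K itself, the training-point error is exactly jitter·|alpha|, and tightly clustered inputs make K ill-conditioned, which inflates alpha far beyond machine precision:

```python
import numpy as np
from scipy.linalg import cholesky, cho_solve

# Hypothetical inputs with a tight cluster, mimicking the 0.0576–0.0579 run above
X = np.array([0.0001, 0.002, 0.0576, 0.0577, 0.0578, 0.0579, 0.1]).reshape(-1, 1)
Y = np.array([-6.82, 0.50, -0.08015, -0.08018, -0.08012, -0.08019, -0.0867]).reshape(-1, 1)

Xn = (X - X.mean()) / X.std()  # normalize as in the pipeline above

def rbf(A, B, variance=1.95, lengthscale=0.48):
    # Squared-exponential kernel with roughly the optimized hyperparameters
    return variance * np.exp(-0.5 * (A - B.T) ** 2 / lengthscale ** 2)

jitter = 1e-11
K = rbf(Xn, Xn)
L = cholesky(K + jitter * np.eye(len(Xn)), lower=True)
alpha = cho_solve((L, True), Y)

# Noise-free "prediction" back at the training inputs
Y_star = K @ alpha
resid = np.max(np.abs(Y_star - Y))
# (K + jitter*I) @ alpha = Y  implies  K @ alpha - Y = -jitter * alpha,
# so the training-point error equals jitter * max|alpha|: near-duplicate
# inputs blow alpha up, and the interpolation misses the data by far
# more than machine epsilon even though the noise variance is 0.
print(f"max|Y* - Y| = {resid:.1e}, jitter * max|alpha| = {jitter * np.abs(alpha).max():.1e}")
```

So a residual on the order of 1e-5 is consistent with zero noise plus a 1e-11 jitter whenever max|alpha| reaches ~1e6, which duplicated or nearly duplicated X values (like the three identical points at X = 0.00545) readily produce.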