
I'm trying to perform Gaussian process regression on the following series of data points (X and Y):

X         Y
0.00001  -6.8200000000000003
0.00014  -6.8073846153846160
0.00042  -6.6818333333333326
0.00200  0.5002400000000000
0.00545  -0.0163174311926606
0.00545  -0.0163174311926606
0.00545  -0.0163174311926606
0.00552  -0.0317916666666667
0.00604  -0.1288741721854305
0.00756  -0.2954391534391534
0.00987  -0.3925845997973657
0.02201  -0.2940145454545455
0.02618  -0.2296286259541984
0.04000  -0.1075512500000000
0.05737  -0.0801517421602787
0.05761  -0.0801819444444445
0.05762  -0.0801947916666667
0.05767  -0.0801204506065858
0.05771  -0.0801724436741768
0.05772  -0.0801856152512998
0.05773  -0.0801980935875217
0.05782  -0.0801769896193772
0.05783  -0.0801899653979239
0.05788  -0.0801169257340242
0.05789  -0.0801297063903282
0.05804  -0.0801881034482759
0.05948  -0.0800737815126051
0.07886  -0.0843219264892269
0.10000  -0.0866689000000000

I'm using GPy to optimize the hyperparameters while fixing the noise variance to 0:

GP_regression.           |                value  |  constraints  |  priors
rbf.variance             |   1.9537980554282994  |      +ve      |
rbf.lengthscale          |  0.47928874454359643  |      +ve      |
Gaussian_noise.variance  |                  0.0  |   +ve fixed   |
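
For reference, the model setup looks roughly like this (simplified sketch; X_norm and Y_norm are illustrative names for the normalized training data as column vectors):

import GPy
import numpy as np

# X_norm, Y_norm: normalized training data, shape (n, 1)
kernel = GPy.kern.RBF(input_dim=1)
model = GPy.models.GPRegression(X_norm, Y_norm, kernel)

# Fix the noise variance to 0, then optimize the remaining hyperparameters
model.Gaussian_noise.variance = 0.0
model.Gaussian_noise.variance.fix()
model.optimize()
print(model)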

Then I compute the values of Y* at each test point X* using the optimized lengthscale and variance from GPy. Since I normalize the data when optimizing the hyperparameters, I normalize X* at the beginning of the function and undo the normalization at the end. I invert the kernel matrix with a Cholesky decomposition, adding a very small jitter to handle matrices with linearly dependent columns:

import numpy as np
from scipy.linalg import cholesky, cho_solve

# K: kernel matrix of the training inputs, e.g. K = kernel.K(X, X)
L = cholesky(K + np.eye(len(X)) * 1e-11, lower=True)  # Add jitter for stability

# Solve for alpha ((K + jitter) * alpha = Y)
alpha_imag = cho_solve((L, True), Y)

def gp_surrogate_model_imag(x_star):
    x_star = np.atleast_1d(x_star).reshape(-1, 1)
    # Normalize input
    x_star_norm = (x_star - X_mean_old) / X_std_old
    # Compute covariance vector k(X^*, X)
    k_star = kernel.K(x_star_norm, X)

    Y_surrogate = k_star @ alpha_imag

    # Convert back to original scale (denormalization)
    Y_surrogate = Y_surrogate * Y_std_old + Y_mean_old

    return Y_surrogate.item()
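
For reference, the quantity this function computes is the standard GP posterior mean with the noise term set to 0:

    Y*(x*) = k(x*, X) @ (K + sigma_n^2 * I)^(-1) @ Y,   with sigma_n^2 = 0 (apart from the 1e-11 jitter)

At a training input x* = x_i, this should return exactly Y_i when there is no noise term.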

Since the noise variance is fixed to 0, the surrogate model should reproduce the training targets exactly at the training points, right?

However, when I compute the differences between Y and Y* at the training points, I obtain differences on the order of 10^-5.
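
The check looks roughly like this (X_raw and Y_raw are illustrative names for the unnormalized training data listed above):

residuals = np.array([gp_surrogate_model_imag(x) for x in X_raw]) - Y_raw
print(np.abs(residuals).max())  # on the order of 1e-5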

Is this an issue of my implementation, or are such small differences expected even when the noise variance is fixed to 0 in GPy?

  • What is the actual function you are trying to fit to these data? BTW, the sample data look to be woefully undersampled between x = 0.00042 and x = 0.00545. Residual errors of 1e-5 might be plausible if the matrix inversion is a bit tricky.
