I'm implementing a Gaussian Process Regression (GPR) model in Python using a Squared Exponential Kernel. However, I'm encountering a ValueError during the matrix multiplication step of the predict method, specifically when trying to compute the mean prediction.
The error I'm seeing is:
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature
(n?,k),(k,m?)->(n?,m?) (size 10 is different from 100)
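For reference, the same error message can be reproduced with two plain arrays whose inner dimensions disagree (the shapes here are just an illustration of the message, not taken from my model):

```python
import numpy as np

# (10, 100) @ (10, 1): operand 1's core dimension 0 is 10, but 100 is required
a = np.ones((10, 100))
b = np.ones((10, 1))
try:
    a @ b
except ValueError as e:
    print(e)  # same "mismatch in its core dimension 0" message
```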
Code Details
Here is the relevant code:
import numpy as np

class SquaredExponentialKernel:
    def __init__(self, length_scale=1.0, variance=1.0):
        self.length_scale = length_scale
        self.variance = variance

    def __call__(self, x1, x2):
        dist_sq = np.sum((x1 - x2)**2)
        return self.variance * np.exp(-0.5 * dist_sq / self.length_scale**2)

def cov_matrix(x1, x2, cov_function) -> np.array:
    return np.array([[cov_function(a, b) for a in x1] for b in x2])

class GPR:
    def __init__(self, data_x, data_y, covariance_function=SquaredExponentialKernel(), white_noise_sigma: float = 0):
        self.noise = white_noise_sigma
        self.data_x = data_x
        self.data_y = data_y
        self.covariance_function = covariance_function
        self._inverse_of_covariance_matrix_of_input_noise_adj = np.linalg.inv(
            cov_matrix(data_x, data_x, covariance_function) + self.noise * np.identity(len(self.data_x))
        )
        self._memory = None

    def predict(self, test_data: np.ndarray) -> np.ndarray:
        KXX_star = cov_matrix(test_data, self.data_x, self.covariance_function)
        KX_starX_star = cov_matrix(test_data, test_data, self.covariance_function)
        mean_test_data = KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y)
        cov_test_data = KX_starX_star - KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ KXX_star.T)
        var_test_data = np.diag(cov_test_data)
        self._memory = {'mean': mean_test_data, 'covariance_matrix': cov_test_data, 'variance': var_test_data}
        return mean_test_data

# Test data
np.random.seed(69)
data_x = np.linspace(-5, 5, 10).reshape(-1, 1)
data_y = np.sin(data_x) + 0.1 * np.random.randn(10, 1)

# Instantiate and predict
gpr_se = GPR(data_x, data_y, covariance_function=SquaredExponentialKernel(), white_noise_sigma=0.1)
test_data = np.linspace(-6, 6, 100).reshape(-1, 1)
mean_predictions = gpr_se.predict(test_data)
Dimension Breakdown
Here's the dimensional analysis for the matrix multiplication where the error occurs:
- KXX_star is computed as cov_matrix(test_data, self.data_x, self.covariance_function), resulting in a shape of (100, 10).
- self._inverse_of_covariance_matrix_of_input_noise_adj is computed in the __init__ method and has a shape of (10, 10).
- self.data_y has a shape of (10, 1).
The line in question is:
mean_test_data = KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y)
This should yield a result with shape (100, 1) because:
- KXX_star has shape (100, 10),
- (self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y) results in shape (10, 1).
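To double-check that arithmetic, multiplying dummy arrays with the shapes I expect at this point works fine:

```python
import numpy as np

# dummy arrays with the shapes I expect in predict()
K = np.ones((100, 10))   # KXX_star
inv = np.ones((10, 10))  # inverse covariance matrix
y = np.ones((10, 1))     # data_y
print((K @ (inv @ y)).shape)  # (100, 1)
```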
Why am I getting a dimension mismatch error here when the dimensions seem to align for the matrix multiplication? And how can I fix it?
I expected this matrix multiplication to work, since the dimensions appear compatible on paper: KXX_star (100, 10) multiplied by (10, 1) should yield (100, 1). The error, however, states a dimension mismatch, implying something isn't aligning as expected. I checked the shapes of self.data_y, self._inverse_of_covariance_matrix_of_input_noise_adj, and KXX_star, and I also tried reshaping data_y to ensure it is consistently (10, 1), but the error persists. I was expecting to get mean predictions as a vector of shape (100, 1) for test_data without any dimension issues.
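As part of that debugging, cov_matrix can be checked in isolation with a dummy kernel (a stand-in function I made up here, since only the output shapes matter, not the kernel values):

```python
import numpy as np

def cov_matrix(x1, x2, cov_function):
    # same comprehension as in my code above
    return np.array([[cov_function(a, b) for a in x1] for b in x2])

# dummy kernel: only the output shape is of interest
dummy = lambda a, b: 1.0

data_x = np.linspace(-5, 5, 10).reshape(-1, 1)
test_data = np.linspace(-6, 6, 100).reshape(-1, 1)

print(cov_matrix(test_data, data_x, dummy).shape)  # shape of KXX_star
print(cov_matrix(data_x, data_x, dummy).shape)     # shape of the training covariance
```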
I'd split the chained @ into separate lines and add shape prints right before each one. I'd also try to reproduce the error message with a test pair of arrays (e.g. a (10,100) @ (10,1)). Are the equations for mean_test_data and cov_test_data correct? This seems to be like kriging, though those equations do not seem right.