I'm implementing a Gaussian Process Regression (GPR) model in Python using a Squared Exponential Kernel. However, I'm encountering a ValueError during the matrix multiplication step of the predict method, specifically when trying to compute the mean prediction.

The error I'm seeing is:

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature 
(n?,k),(k,m?)->(n?,m?) (size 10 is different from 100)

Code Details

Here is the code involved in the error:

import numpy as np

class SquaredExponentialKernel:
    def __init__(self, length_scale=1.0, variance=1.0):
        self.length_scale = length_scale
        self.variance = variance

    def __call__(self, x1, x2):
        dist_sq = np.sum((x1 - x2)**2)
        return self.variance * np.exp(-0.5 * dist_sq / self.length_scale**2)

def cov_matrix(x1, x2, cov_function) -> np.array:
    return np.array([[cov_function(a, b) for a in x1] for b in x2])

class GPR:
    def __init__(self, data_x, data_y, covariance_function=SquaredExponentialKernel(), white_noise_sigma: float = 0):
        self.noise = white_noise_sigma
        self.data_x = data_x
        self.data_y = data_y
        self.covariance_function = covariance_function
        self._inverse_of_covariance_matrix_of_input_noise_adj = np.linalg.inv(
            cov_matrix(data_x, data_x, covariance_function) + self.noise * np.identity(len(self.data_x))
        )
        self._memory = None

    def predict(self, test_data: np.ndarray) -> np.ndarray:
        KXX_star = cov_matrix(test_data, self.data_x, self.covariance_function)
        KX_starX_star = cov_matrix(test_data, test_data, self.covariance_function)
        mean_test_data = KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y)
        cov_test_data = KX_starX_star - KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ KXX_star.T)
        var_test_data = np.diag(cov_test_data)
        self._memory = {'mean': mean_test_data, 'covariance_matrix': cov_test_data, 'variance': var_test_data}
        return mean_test_data

# Test data
np.random.seed(69)
data_x = np.linspace(-5, 5, 10).reshape(-1, 1)
data_y = np.sin(data_x) + 0.1 * np.random.randn(10, 1)

# Instantiate and predict
gpr_se = GPR(data_x, data_y, covariance_function=SquaredExponentialKernel(), white_noise_sigma=0.1)
test_data = np.linspace(-6, 6, 100).reshape(-1, 1)
mean_predictions = gpr_se.predict(test_data)

Dimension Breakdown

Here's the dimensional analysis for the matrix multiplication where the error occurs:

  1. KXX_star is computed as cov_matrix(test_data, self.data_x, self.covariance_function), resulting in a shape of (100, 10).
  2. self._inverse_of_covariance_matrix_of_input_noise_adj is computed in the __init__ method and has a shape of (10, 10).
  3. self.data_y has a shape of (10, 1).

The line in question is:

mean_test_data = KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y)

This should yield a result with shape (100, 1) because:

  • KXX_star has shape (100, 10),
  • (self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y) results in shape (10, 1).

Why am I getting a dimension mismatch error here when the dimensions seem to align for the matrix multiplication? And how can I fix it?

I expected this matrix multiplication to work as the dimensions appear compatible on paper: KXX_star (100, 10) multiplied by (10, 1) should yield (100, 1). The error, however, states a dimension mismatch, implying something isn’t aligning as expected. I checked shapes for self.data_y, self._inverse_of_covariance_matrix_of_input_noise_adj, and KXX_star. Also tried reshaping data_y to ensure it’s consistently (10, 1), but the error persists. I was expecting to get mean predictions as a vector of shape (100, 1) for test_data without any dimension issues.

  • To debug this, I'd split the two @ operations onto separate lines and print the shapes right before each. I'd also try to reproduce the error message with a test pair of arrays (e.g. a (10,100) @ (10,1)). Commented Nov 10, 2024 at 15:46
  • Are those equations for mean_test_data and cov_test_data correct? This looks like kriging, though those equations do not seem right. Commented Nov 11, 2024 at 10:13

2 Answers

This reproduces your error message:

In [4]: a = np.ones((10,100)); b = np.ones((10,1)); a@b
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 a = np.ones((10,100)); b = np.ones((10,1)); a@b

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0,
with gufunc signature (n?,k),(k,m?)->(n?,m?) 
(size 10 is different from 100)

This indicates that KXX_star is the transpose of what you think!

In your question, it wasn't always clear which shapes were verified, and which were just 'wishes'.

edit

cov_matrix(test_data,...)

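test_data has size 100, and that is the column dimension of the cov array: the inner comprehension in cov_matrix iterates over x1, so cov_matrix(x1, x2, ...) has shape (len(x2), len(x1)), which is (10, 100) here.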
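A minimal sketch of the point above, assuming the fix is to swap the loop order inside cov_matrix (transposing the result at the call site would work too). The names se_kernel, cov_matrix_original and cov_matrix_fixed are illustrative and not from the question; se_kernel is a plain-function form of the question's SquaredExponentialKernel with its default parameters.

```python
import numpy as np

def se_kernel(x1, x2, length_scale=1.0, variance=1.0):
    # Scalar squared exponential kernel, same formula as in the question
    dist_sq = np.sum((x1 - x2) ** 2)
    return variance * np.exp(-0.5 * dist_sq / length_scale**2)

def cov_matrix_original(x1, x2, cov_function):
    # Question's version: the outer loop over x2 gives the rows and the
    # inner loop over x1 gives the columns, so the shape is (len(x2), len(x1))
    return np.array([[cov_function(a, b) for a in x1] for b in x2])

def cov_matrix_fixed(x1, x2, cov_function):
    # Hypothetical fix: rows follow x1, columns follow x2 -> (len(x1), len(x2))
    return np.array([[cov_function(a, b) for b in x2] for a in x1])

np.random.seed(69)
data_x = np.linspace(-5, 5, 10).reshape(-1, 1)
data_y = np.sin(data_x) + 0.1 * np.random.randn(10, 1)
test_data = np.linspace(-6, 6, 100).reshape(-1, 1)

k_orig = cov_matrix_original(test_data, data_x, se_kernel)
k_fixed = cov_matrix_fixed(test_data, data_x, se_kernel)
print(k_orig.shape)   # (10, 100)
print(k_fixed.shape)  # (100, 10)

# With the fixed orientation, the mean computation from the question lines up
k_inv = np.linalg.inv(
    cov_matrix_fixed(data_x, data_x, se_kernel) + 0.1 * np.identity(len(data_x))
)
mean_test_data = k_fixed @ (k_inv @ data_y)
print(mean_test_data.shape)  # (100, 1)
```

The training covariance is symmetric, so the loop order does not matter for cov_matrix(data_x, data_x, ...); only the rectangular cross-covariance is affected.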

1 Comment

Did you actually test KXX_star.shape?

Here is how you could solve the problem. I use the kriging functions for mu and var:

class SquaredExponentialKernel:
    def __init__(self, length_scale=1.0, variance=1.0):
        self.length_scale = length_scale
        self.variance = variance
    
    def __call__(self, x1, x2):
        x1 = np.atleast_2d(x1)
        x2 = np.atleast_2d(x2)
        dist_sq = np.square(x1[:,None] - x2).sum(2)
        # Vectorized via broadcasting. Alternatively, use `cdist` from `scipy.spatial.distance`:
        # dist_sq = scipy.spatial.distance.cdist(x1, x2, 'sqeuclidean')
        return self.variance * np.exp(-0.5 * dist_sq / self.length_scale**2)
    


class GPR:
    def __init__(self, data_x, data_y, covariance_function=SquaredExponentialKernel(), white_noise_sigma: float = 0):
        self.noise = white_noise_sigma
        self.data_x = data_x
        self.data_y = data_y
        self.covariance_function = covariance_function
        self.kernel = covariance_function(data_x, data_x) + self.noise * np.identity(len(self.data_x))
        self._inv = np.linalg.inv(self.kernel)
        self._ones = np.ones_like(data_x).ravel()
        self._denom = self._ones @ self._inv @ self._ones
        self._mean = self._ones @ self._inv @ data_y /self._denom
        self._diff = (data_y.ravel() - self._mean)
        self._var =  self._diff @ self._inv @ self._diff / data_y.size
                        

    def predict(self, test_data: np.ndarray) -> dict:
        KXX_star = self.covariance_function(self.data_x, test_data)
        mean_test_data = self._mean + KXX_star.T @ self._inv @ self._diff
        a = (self._inv @ KXX_star)
        cov_test_data = self._var * (1 - KXX_star.T @ a + 
                                    (1 - self._ones @ a)**2/self._denom)
        var_test_data = np.diag(cov_test_data)
        self._memory = {'mean': mean_test_data, 'covariance_matrix': cov_test_data, 'variance': var_test_data}
        return self._memory
    
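#%%
# Note: `a` is not defined in this snippet; it is presumably an already fitted model
# (e.g. a scikit-learn GaussianProcessRegressor) whose kernel_ provides the length scale.
gpr_se = GPR(data_x, data_y, covariance_function=SquaredExponentialKernel(a.kernel_.get_params()['length_scale']), white_noise_sigma=0)
m = gpr_se.predict(test_data)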
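As a quick shape check on the vectorized kernel above, here is a small sketch. se_kernel_vectorized is a hypothetical plain-function form of the __call__ method with default length_scale and variance, and the loop-based comparison is only there to confirm that both give the same values.

```python
import numpy as np

def se_kernel_vectorized(x1, x2, length_scale=1.0, variance=1.0):
    x1 = np.atleast_2d(x1)
    x2 = np.atleast_2d(x2)
    # (n, 1, d) - (m, d) broadcasts to (n, m, d); summing over d gives (n, m)
    dist_sq = np.square(x1[:, None] - x2).sum(2)
    return variance * np.exp(-0.5 * dist_sq / length_scale**2)

data_x = np.linspace(-5, 5, 10).reshape(-1, 1)
test_data = np.linspace(-6, 6, 100).reshape(-1, 1)

# Rows follow the first argument, columns follow the second
k_star = se_kernel_vectorized(data_x, test_data)
print(k_star.shape)  # (10, 100)

# Element-by-element reference computed with explicit loops
k_loop = np.array(
    [[np.exp(-0.5 * np.sum((a - b) ** 2)) for b in test_data] for a in data_x]
)
print(np.allclose(k_star, k_loop))  # True
```

Because the rows follow the first argument, self.covariance_function(self.data_x, test_data) has shape (10, 100) for this data, which is why predict uses KXX_star.T in the mean.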
