`ValueError` in Matrix Multiplication for Gaussian Process Regression Implementation

Question

I'm implementing a Gaussian Process Regression (GPR) model in Python using a Squared Exponential Kernel. However, I'm encountering a ValueError during the matrix multiplication step of the predict method, specifically when trying to compute the mean prediction.

The error I'm seeing is:

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature 
(n?,k),(k,m?)->(n?,m?) (size 10 is different from 100)

Code Details

Here is the code involved in the error:

import numpy as np

class SquaredExponentialKernel:
    def __init__(self, length_scale=1.0, variance=1.0):
        self.length_scale = length_scale
        self.variance = variance

    def __call__(self, x1, x2):
        dist_sq = np.sum((x1 - x2)**2)
        return self.variance * np.exp(-0.5 * dist_sq / self.length_scale**2)

def cov_matrix(x1, x2, cov_function) -> np.array:
    return np.array([[cov_function(a, b) for a in x1] for b in x2])

class GPR:
    def __init__(self, data_x, data_y, covariance_function=SquaredExponentialKernel(), white_noise_sigma: float = 0):
        self.noise = white_noise_sigma
        self.data_x = data_x
        self.data_y = data_y
        self.covariance_function = covariance_function
        self._inverse_of_covariance_matrix_of_input_noise_adj = np.linalg.inv(
            cov_matrix(data_x, data_x, covariance_function) + self.noise * np.identity(len(self.data_x))
        )
        self._memory = None

    def predict(self, test_data: np.ndarray) -> np.ndarray:
        KXX_star = cov_matrix(test_data, self.data_x, self.covariance_function)
        KX_starX_star = cov_matrix(test_data, test_data, self.covariance_function)
        mean_test_data = KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y)
        cov_test_data = KX_starX_star - KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ KXX_star.T)
        var_test_data = np.diag(cov_test_data)
        self._memory = {'mean': mean_test_data, 'covariance_matrix': cov_test_data, 'variance': var_test_data}
        return mean_test_data

# Test data
np.random.seed(69)
data_x = np.linspace(-5, 5, 10).reshape(-1, 1)
data_y = np.sin(data_x) + 0.1 * np.random.randn(10, 1)

# Instantiate and predict
gpr_se = GPR(data_x, data_y, covariance_function=SquaredExponentialKernel(), white_noise_sigma=0.1)
test_data = np.linspace(-6, 6, 100).reshape(-1, 1)
mean_predictions = gpr_se.predict(test_data)

Dimension Breakdown

Here's the dimensional analysis for the matrix multiplication where the error occurs:

KXX_star is computed as cov_matrix(test_data, self.data_x, self.covariance_function), resulting in a shape of (100, 10).
self._inverse_of_covariance_matrix_of_input_noise_adj is computed in the __init__ method and has a shape of (10, 10).
self.data_y has a shape of (10, 1).

The line in question is:

mean_test_data = KXX_star @ (self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y)

This should yield a result with shape (100, 1) because:

KXX_star has shape (100, 10),
(self._inverse_of_covariance_matrix_of_input_noise_adj @ self.data_y) results in shape (10, 1).

Why am I getting a dimension mismatch error here when the dimensions seem to align for the matrix multiplication? And how can I fix it?

I expected this multiplication to work because the dimensions appear compatible on paper: KXX_star (100, 10) multiplied by a (10, 1) array should yield (100, 1). The error, however, reports a dimension mismatch, so something isn't aligned as I assumed. I checked the shapes of self.data_y, self._inverse_of_covariance_matrix_of_input_noise_adj, and KXX_star, and I also tried reshaping data_y to make sure it is consistently (10, 1), but the error persists. I expected to get the mean predictions for test_data as an array of shape (100, 1).

To debug this, I'd put the two @ operations on separate lines and print the shapes right before each one. I'd also try to reproduce the error message with a test pair of arrays, e.g. (10,100) @ (10,1).
– hpaulj, Commented Nov 10, 2024 at 15:46
Are those equations for mean_test_data and cov_test_data correct? This looks like kriging, though those equations do not seem right.
– Onyambu, Commented Nov 11, 2024 at 10:13
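
A minimal sketch of the debugging approach suggested in the first comment: compute each product on its own line and print the operand shapes first. The helper name `checked_matmul` is hypothetical, and the arrays are placeholders filled with ones; they only mimic the shapes that the question's code produces, not the real kernel values.

```python
import numpy as np

def checked_matmul(left, right, label):
    """Print both operand shapes, then multiply (hypothetical helper)."""
    print(f"{label}: {left.shape} @ {right.shape}")
    return left @ right

# Placeholder arrays standing in for the question's operands.
K_inv = np.ones((10, 10))      # inverse of the training covariance
data_y = np.ones((10, 1))      # training targets
KXX_star = np.ones((10, 100))  # shape the question's cov_matrix call returns

alpha = checked_matmul(K_inv, data_y, "K_inv @ data_y")
try:
    mean = checked_matmul(KXX_star, alpha, "KXX_star @ alpha")
except ValueError as err:
    print("matmul failed:", err)
    # Transposing gives the (100, 10) operand the formula expects.
    mean = checked_matmul(KXX_star.T, alpha, "KXX_star.T @ alpha")
```

Printing the shapes immediately before each product separates the shapes that were actually verified from the ones that were only assumed.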

hpaulj · Accepted Answer · 2024-11-10 20:15:06Z

This reproduces your error message:

In [4]: a = np.ones((10,100)); b = np.ones((10,1)); a@b
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 a = np.ones((10,100)); b = np.ones((10,1)); a@b

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0,
with gufunc signature (n?,k),(k,m?)->(n?,m?) 
(size 10 is different from 100)

This indicates that KXX_star is the transpose of what you think: its shape is (10, 100), not (100, 10).

In your question, it wasn't always clear which shapes were verified, and which were just 'wishes'.

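Edit:

cov_matrix(test_data,...)

test_data has size 100, and because cov_matrix iterates over its first argument in the inner loop, that 100 becomes the column dimension of the cov array, not the row dimension.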
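
To make this concrete, here is a small sketch that reuses the question's `cov_matrix` and kernel formula and prints the resulting shapes. The function `se_kernel` is a plain-function stand-in for the kernel class, and `cov_matrix_rows_first` is a hypothetical variant (not part of the original code) that swaps the loop order so the first argument indexes rows.

```python
import numpy as np

def se_kernel(x1, x2, length_scale=1.0, variance=1.0):
    # Same squared-exponential formula as the question's kernel class.
    dist_sq = np.sum((x1 - x2) ** 2)
    return variance * np.exp(-0.5 * dist_sq / length_scale**2)

def cov_matrix(x1, x2, cov_function):
    # Original version: outer loop over x2, so x1 sets the COLUMN count.
    return np.array([[cov_function(a, b) for a in x1] for b in x2])

def cov_matrix_rows_first(x1, x2, cov_function):
    # Hypothetical variant: outer loop over x1, so x1 sets the ROW count.
    return np.array([[cov_function(a, b) for b in x2] for a in x1])

data_x = np.linspace(-5, 5, 10).reshape(-1, 1)
test_data = np.linspace(-6, 6, 100).reshape(-1, 1)

original = cov_matrix(test_data, data_x, se_kernel)
swapped = cov_matrix_rows_first(test_data, data_x, se_kernel)

print(original.shape)  # (10, 100)
print(swapped.shape)   # (100, 10)
```

With a (100, 10) `KXX_star`, both products in the question's `predict` method have compatible shapes as written. Transposing the output of the original `cov_matrix` gives the same array as the swapped variant.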

edited Nov 10, 2024 at 20:15

answered Nov 10, 2024 at 16:57

hpaulj

hpaulj Over a year ago

Did you actually test KXX_star.shape?

Onyambu · Accepted Answer · 2024-11-12 00:40:26Z

I will use the kriging formulas for the mean and variance. Here is how you could solve the problem:

class SquaredExponentialKernel:
    def __init__(self, length_scale=1.0, variance=1.0):
        self.length_scale = length_scale
        self.variance = variance
    
    def __call__(self, x1, x2):
        x1 = np.atleast_2d(x1)
        x2 = np.atleast_2d(x2)
        # Vectorized pairwise squared distances via broadcasting.
        dist_sq = np.square(x1[:, None] - x2).sum(2)
        # Alternative: use `cdist` from `scipy.spatial.distance`:
        # dist_sq = scipy.spatial.distance.cdist(x1, x2, 'sqeuclidean')
        return self.variance * np.exp(-0.5 * dist_sq / self.length_scale**2)
    


class GPR:
    def __init__(self, data_x, data_y, covariance_function=SquaredExponentialKernel(), white_noise_sigma: float = 0):
        self.noise = white_noise_sigma
        self.data_x = data_x
        self.data_y = data_y
        self.covariance_function = covariance_function
        self.kernel = covariance_function(data_x, data_x) + self.noise * np.identity(len(self.data_x))
        self._inv = np.linalg.inv(self.kernel)
        self._ones = np.ones_like(data_x).ravel()
        self._denom = self._ones @ self._inv @ self._ones
        self._mean = self._ones @ self._inv @ data_y /self._denom
        self._diff = (data_y.ravel() - self._mean)
        self._var =  self._diff @ self._inv @ self._diff / data_y.size
                        

    def predict(self, test_data: np.ndarray) -> dict:
        KXX_star = self.covariance_function(self.data_x, test_data)
        mean_test_data = self._mean + KXX_star.T @ self._inv @ self._diff
        a = (self._inv @ KXX_star)
        cov_test_data = self._var * (1 - KXX_star.T @ a + 
                                    (1 - self._ones @ a)**2/self._denom)
        var_test_data = np.diag(cov_test_data)
        self._memory = {'mean': mean_test_data, 'covariance_matrix': cov_test_data, 'variance': var_test_data}
        return self._memory
    
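#%%
# NOTE: `a` is not defined in this snippet; its `kernel_` attribute suggests a
# previously fitted scikit-learn GaussianProcessRegressor (an assumption).
gpr_se = GPR(data_x, data_y, covariance_function=SquaredExponentialKernel(a.kernel_.get_params()['length_scale']), white_noise_sigma=0)
m = gpr_se.predict(test_data)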
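
As a quick sanity check on the vectorized kernel above, the following sketch compares the broadcasting expression with an explicit double loop. The function names `se_vectorized` and `se_loop` are hypothetical, and the default kernel parameters (length scale 1, variance 1) are assumed.

```python
import numpy as np

def se_vectorized(x1, x2, length_scale=1.0, variance=1.0):
    # x1[:, None] has shape (n, 1, d); subtracting x2 of shape (m, d)
    # broadcasts to (n, m, d), and summing over the last axis gives (n, m).
    x1 = np.atleast_2d(x1)
    x2 = np.atleast_2d(x2)
    dist_sq = np.square(x1[:, None] - x2).sum(axis=2)
    return variance * np.exp(-0.5 * dist_sq / length_scale**2)

def se_loop(x1, x2, length_scale=1.0, variance=1.0):
    # Reference implementation: one kernel evaluation per (row, column) pair.
    out = np.empty((len(x1), len(x2)))
    for i, a in enumerate(x1):
        for j, b in enumerate(x2):
            dist_sq = np.sum((a - b) ** 2)
            out[i, j] = variance * np.exp(-0.5 * dist_sq / length_scale**2)
    return out

data_x = np.linspace(-5, 5, 10).reshape(-1, 1)
test_data = np.linspace(-6, 6, 100).reshape(-1, 1)

K_vec = se_vectorized(data_x, test_data)
K_loop = se_loop(data_x, test_data)
print(K_vec.shape)                 # (10, 100)
print(np.allclose(K_vec, K_loop))  # True
```

Here the first argument sets the row count, which is why `predict` in this answer multiplies by `KXX_star.T`.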
