I'm trying to use sklearn's GaussianProcessRegressor to fit data and then predict at new points. The predictions need to be positive (preferably above a specified minimum).
I'd like to use that minimum as a constraint in the solver. The optimizer backend should be able to handle it, but the way I've got it working is slow, and the constraint is sometimes treated as more of a suggestion when no theta values can satisfy the minimum (although it does "pull" the predicted minimum toward the desired value, and that case can be remedied with a kernel change).
At any rate, I thought someone might appreciate the work, and I would appreciate any pointers. Y_mean was harvested from the sklearn source code and repurposed.
import numpy as np
from scipy.linalg import cholesky, cho_solve

def Y_mean(x0):
    # Recompute the posterior mean at the prediction points X for the trial
    # hyperparameters x0, mirroring GaussianProcessRegressor.fit/predict.
    # "gpr" (the unconstrained fit) and "X" (prediction points) are globals.
    gpr.kernel_.theta = x0
    K = gpr.kernel_(gpr.X_train_)
    K[np.diag_indices_from(K)] += gpr.alpha
    L_ = cholesky(K, lower=True, check_finite=False)
    alpha_ = cho_solve((L_, True), gpr.y_train_, check_finite=False)
    K_trans = gpr.kernel_(X, gpr.X_train_)
    y_mean = K_trans @ alpha_
    y_mean = gpr._y_train_std * y_mean + gpr._y_train_mean
    print(x0, y_mean.min())
    # Inequality constraint (>= 0): the smallest predicted mean must stay
    # above 90% of the minimum training value.
    return y_mean.min() - 0.9 * gpr.y_train_.min()
from scipy.optimize import minimize

def conOp(obj_func, x0, bounds):
    # Custom optimizer for GaussianProcessRegressor: minimize the negative
    # log-marginal-likelihood subject to the inequality constraint Y_mean >= 0.
    res = minimize(
        obj_func, x0, bounds=bounds, jac=True, tol=0.1,
        constraints={'type': 'ineq', 'fun': Y_mean})
    print('ans', res.x, Y_mean(res.x))
    print(res)
    return res.x, res.fun
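For context on the interface: when a callable is passed as optimizer, GaussianProcessRegressor calls it as optimizer(obj_func, initial_theta, bounds=bounds), and obj_func(theta) returns the negative log-marginal-likelihood together with its gradient, which is why jac=True is passed to minimize. Since constraints are supplied and no method is specified, minimize should fall back to SLSQP.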
In the main code, an unconstrained GPR is fit first; that fitted object is the "gpr" referenced in the functions above. The constrained GPR is then fit with optimizer=conOp (sketched below).
I believe this is fine, since the theta values come from the constrained GPR's optimizer and all subsequent post-fit quantities use those optimizer-controlled thetas. Both the constrained and unconstrained GPRs also use the same input kernel and alpha values. The tol value of 0.1 typically results in about a dozen iterations, whereas the default (None) runs for over 300.
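For reference, a minimal sketch of how the pieces fit together; the data, kernel, and alpha below are placeholders rather than my actual values:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Placeholder data: strictly positive targets on a 1-D grid.
X_train = np.linspace(0, 10, 25).reshape(-1, 1)
y_train = np.sin(X_train).ravel() + 2.0
X = np.linspace(0, 10, 200).reshape(-1, 1)   # prediction points read globally by Y_mean

kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)

# 1) Unconstrained fit; this is the global "gpr" that Y_mean reads from.
gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-6)
gpr.fit(X_train, y_train)

# 2) Constrained fit: same kernel/alpha, but theta is chosen by conOp,
#    which enforces Y_mean(theta) >= 0 at the points X.
gpr_con = GaussianProcessRegressor(kernel=kernel, alpha=1e-6, optimizer=conOp)
gpr_con.fit(X_train, y_train)

y_pred = gpr_con.predict(X)
print(y_pred.min(), 0.9 * y_train.min())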