I’m trying to use qKnowledgeGradient with a fully Bayesian SAAS (Sparse Axis-Aligned Subspace) GP (SaasFullyBayesianSingleTaskGP) in BoTorch. To do so, I wrote a new class that inherits from both SaasFullyBayesianSingleTaskGP and FantasizeMixin and overrides the fantasize() method to define how fantasy data is generated for this model. I fit the model with num_samples=256, warmup_steps=512, and thinning=16, and use num_fantasies=2 in the acquisition. However, when I run the code I keep getting a shape mismatch error, even with raw_samples=1 and num_restarts=1. The error looks like this:
RuntimeError: shape '[2, 1, 16, 1]' is invalid for input of size 64
I created a custom SAAS GP by inheriting from both SaasFullyBayesianSingleTaskGP and FantasizeMixin and overriding the fantasize() method. I then tried to use this model with qKnowledgeGradient, setting num_fantasies=2 and reducing raw_samples and num_restarts to 1 (so only a single t-batch is used). I expected the acquisition to evaluate successfully and produce a candidate point, but instead KG fails with the broadcast/reshape error above.
The error occurs regardless of whether I use the default KG (Knowledge Gradient) implementation or a custom KG that loops over the batch dimension and manually averages over the MCMC ensemble, and I haven't been able to eliminate it by collapsing batch dimensions either. I have also printed the tensor dimensions and they look fine to me.
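For reference, the shape checks I ran look roughly like this (using the model from the example below; the expected values in the comments are what I believe the shapes should be, given 256 NUTS samples with thinning 16):

# Rough sketch of the shape checks I did after fitting the model below.
# With num_samples=256 and thinning=16, 16 MCMC samples are kept, so I
# expect a model batch shape of torch.Size([16]).
print(model.batch_shape)            # torch.Size([16])
test_X = torch.rand(1, 100, dtype=torch.double)
posterior = model.posterior(test_X)
print(posterior.mean.shape)         # torch.Size([16, 1, 1]) for a single test point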
Below is a minimal version of my code to reproduce the issue.
Minimal reproducible example
import math
from typing import Optional

import numpy as np
import torch
from torch.quasirandom import SobolEngine

from botorch.acquisition.knowledge_gradient import qKnowledgeGradient
from botorch.acquisition.objective import ScalarizedPosteriorTransform
from botorch.fit import fit_fully_bayesian_model_nuts
from botorch.models.fully_bayesian import SaasFullyBayesianSingleTaskGP
from botorch.models.model import FantasizeMixin, Model
from botorch.optim import optimize_acqf
from botorch.sampling.base import MCSampler
from botorch.sampling.normal import SobolQMCNormalSampler

Branin function embedded in 100D

# Bounds of the 100D embedding; only two coordinates are active.
lb = np.hstack((-5 * np.ones(50), 0 * np.ones(50)))
ub = np.hstack((10 * np.ones(50), 15 * np.ones(50)))

def branin100(x):
    assert (x <= ub).all() and (x >= lb).all()
    x1, x2 = x[19], x[64]  # the two active dimensions
    t1 = x2 - 5.1 / (4 * math.pi ** 2) * x1 ** 2 + 5 / math.pi * x1 - 6
    t2 = 10 * (1 - 1 / (8 * math.pi)) * np.cos(x1)
    return t1 ** 2 + t2 + 10
SAAS GP with custom fantasize() method
class SaasFullyBayesianSingleTaskGPWithFantasy(SaasFullyBayesianSingleTaskGP, FantasizeMixin):
    def fantasize(
        self,
        X: torch.Tensor,
        sampler: Optional[MCSampler] = None,
        num_fantasies: int = 2,
        **kwargs,
    ) -> Model:
        # Default to a Sobol QMC sampler drawing `num_fantasies` fantasy samples.
        if sampler is None:
            sampler = SobolQMCNormalSampler(
                sample_shape=torch.Size([num_fantasies]),
                collapse_batch_dims=True,
            )
        # Match the dtype/device of the training data.
        X = torch.as_tensor(
            X, dtype=self.train_inputs[0].dtype, device=self.train_inputs[0].device
        )
        return FantasizeMixin.fantasize(self, X, sampler=sampler, **kwargs)
Running SAASBO with KG
def run_saasbo_botorch():
torch.manual_seed(0)
dtype = torch.double
device = "cpu"
dim = 100
lb_torch = torch.zeros(dim, dtype=dtype)
ub_torch = torch.ones(dim, dtype=dtype)
bounds = torch.stack([lb_torch, ub_torch])
def f(x): return branin100(x)
# Initial Sobol samples
sobol = SobolEngine(dim, scramble=True, seed=0)
X = sobol.draw(4).to(dtype=dtype) # 4 initial points
Y = torch.tensor(
[f(lb + (ub - lb) * x.cpu().numpy()) for x in X],
dtype=dtype
).unsqueeze(-1)
train_Y = (Y - Y.mean()) / Y.std()
# Fit SAAS GP
model = SaasFullyBayesianSingleTaskGPWithFantasy(X, train_Y)
fit_fully_bayesian_model_nuts(
model, warmup_steps=512, num_samples=256, thinning=16
)
# Define posterior transform
weights = torch.ones(2, dtype=dtype) / 2
post_tf = ScalarizedPosteriorTransform(weights=weights)
# Define KG acquisition
qkg = qKnowledgeGradient(
model=model,
num_fantasies=2,
current_value=train_Y.min(),
posterior_transform=post_tf,
)
# Optimize acquisition
candidate, _ = optimize_acqf(
acq_function=qkg,
bounds=bounds,
q=1,
raw_samples=1,
num_restarts=1,
)
run_saasbo_botorch()
Error
RuntimeError: shape '[2, 1, 16, 1]' is invalid for input of size 64
I don't understand why I keep getting this error or where it is coming from. Any guidance on what might be causing it, and on how to properly structure the fantasy model in this context, would be greatly appreciated!
Thanks in advance.
EDIT: I overrode condition_on_observations and changed num_fantasies to 64 (code below), but now I get another error:
Output shape not equal to that of weights. Output shape is 1 and weights are torch.Size([64])
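Roughly, the acquisition in the edited version is set up like this (same as in the original example, except for num_fantasies and the length of the scalarization weights):

# Sketch of the edited acquisition setup; only num_fantasies and the
# weight vector length differ from the original example above.
weights = torch.ones(64, dtype=dtype) / 64
post_tf = ScalarizedPosteriorTransform(weights=weights)
qkg = qKnowledgeGradient(
    model=model,
    num_fantasies=64,
    current_value=train_Y.min(),
    posterior_transform=post_tf,
)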
Code for condition_on_observations:
def condition_on_observations(self, X: torch.Tensor, Y: torch.Tensor, **kwargs):
    model_batch_ndim = len(self.batch_shape)
    # Simple case: non-batched X and Y; just tile them over the MCMC batch.
    if X.ndim == 2 and Y.ndim == 2:
        X = X.repeat(self.batch_shape + (1, 1)).contiguous()
        Y = Y.repeat(self.batch_shape + (1, 1)).contiguous()
        return super().condition_on_observations(X, Y, **kwargs)
    # Otherwise, move the model batch dims of Y to the front, ahead of any
    # extra (e.g. fantasy) batch dims.
    start_idx = Y.ndim - (2 + model_batch_ndim)
    model_batch_indices = list(range(start_idx, start_idx + model_batch_ndim))
    extra_indices = list(range(0, start_idx))
    remaining_indices = list(range(start_idx + model_batch_ndim, Y.ndim - 2))
    permute_order = model_batch_indices + extra_indices + remaining_indices + [Y.ndim - 2, Y.ndim - 1]
    Y_perm = Y.permute(*permute_order).contiguous()
    # Broadcast X over the model batch shape if it is not batched already.
    if X.shape[:model_batch_ndim] != self.batch_shape:
        X = X.expand(self.batch_shape + X.shape[-2:])
    # Insert singleton dims for the extra batch dims and expand X to match Y.
    extra_dims = len(extra_indices) + len(remaining_indices)
    for _ in range(extra_dims):
        X = X.unsqueeze(model_batch_ndim)
    expand_shape = list(X.shape)
    for i in range(extra_dims):
        expand_shape[model_batch_ndim + i] = Y_perm.shape[model_batch_ndim + i]
    X_expanded = X.expand(*expand_shape).contiguous()
    # Flatten all non-model batch dims into a single "observation" dimension.
    flat_size = int(torch.tensor(Y_perm.shape[model_batch_ndim:-1]).prod())
    X_flat = X_expanded.reshape(*X_expanded.shape[:model_batch_ndim], flat_size, X_expanded.shape[-1]).clone()
    Y_flat = Y_perm.reshape(*Y_perm.shape[:model_batch_ndim], flat_size, Y_perm.shape[-1]).clone()
    return super().condition_on_observations(X_flat, Y_flat, **kwargs)
While you can override AbstractFullyBayesianSingleTaskGP.condition_on_observations to suit your use case, I'd still try to narrow it down further and see if the error is deep within the call structure. I don't have PyTorch set up at the moment, so I can't replicate your process.
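If it were me, I'd evaluate the pieces directly on a hand-built candidate tensor instead of going through optimize_acqf, to see which call actually raises the reshape error. Something along these lines (an untested sketch, reusing model, qkg, dim, and dtype from your example):

# Untested sketch: call the pieces outside optimize_acqf to locate the error.
# qKnowledgeGradient expects candidates of shape (b, q + num_fantasies, d).
X_cand = torch.rand(1, 1 + qkg.num_fantasies, dim, dtype=dtype)
try:
    fm = model.fantasize(X_cand[..., :1, :], sampler=qkg.sampler)
    print("fantasize OK, fantasy model batch shape:", fm.batch_shape)
except RuntimeError as e:
    print("fantasize failed:", e)
try:
    print("qKG value:", qkg(X_cand))
except RuntimeError as e:
    print("qKG failed:", e)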