In PyTorch's nn.MultiheadAttention implementation, is it true that in_proj_weight is laid out so that the first embed_dim rows correspond to the query, the next embed_dim rows to the key, and the final embed_dim rows to the value? Just confirming.

This is a question asked in the same context, but it doesn't answer my specific question.

1 Answer

Yes, that is the case.

You can see how in_proj_weight is used in the _in_projection_packed function in torch.nn.functional, whose docstring describes the argument as:

projection weights for q, k and v, packed into a single tensor. Weights
are packed along dimension 0, in q, k, v order.
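
To check the layout yourself, here is a minimal sketch. It assumes the default configuration (kdim and vdim equal to embed_dim, bias enabled, no dropout), in which the module exposes a single in_proj_weight of shape (3 * embed_dim, embed_dim). The sizes and the split_heads helper are illustrative choices, not part of PyTorch. The sketch splits the packed tensor in q, k, v order, recomputes self-attention by hand, and compares the result with the module's own output.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
embed_dim, num_heads = 8, 2          # illustrative sizes
head_dim = embed_dim // num_heads

mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
mha.eval()

# Packed along dimension 0: rows [0, E) are q, [E, 2E) are k, [2E, 3E) are v.
w_q, w_k, w_v = mha.in_proj_weight.chunk(3, dim=0)
b_q, b_k, b_v = mha.in_proj_bias.chunk(3, dim=0)

x = torch.randn(1, 5, embed_dim)     # (batch, seq, embed_dim)


def split_heads(t):
    # Hypothetical helper: (batch, seq, E) -> (batch, heads, seq, head_dim)
    b, s, _ = t.shape
    return t.view(b, s, num_heads, head_dim).transpose(1, 2)


with torch.no_grad():
    q = split_heads(F.linear(x, w_q, b_q))
    k = split_heads(F.linear(x, w_k, b_k))
    v = split_heads(F.linear(x, w_v, b_v))

    # Scaled dot-product attention, written out by hand.
    scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim)
    attn = torch.softmax(scores, dim=-1) @ v

    # Merge heads back to (batch, seq, E) and apply the output projection.
    manual = mha.out_proj(attn.transpose(1, 2).reshape(1, 5, embed_dim))
    expected, _ = mha(x, x, x, need_weights=False)

matches = torch.allclose(manual, expected, atol=1e-5)
print(matches)
```

If the slices were taken in any other order, the attention scores or the values would change and the comparison would generally fail.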