I have a for loop that I would like to vectorize with numpy. In the snippet below, R, A, and done are numpy arrays of length num_rows, while Q and Q1 are matrices of shape (num_rows, num_cols). Also worth noting: all elements of A are between 0 and num_cols - 1, and all elements of done are either 0 or 1. I basically want to do the same thing as the for loop below, but taking advantage of numpy vectorization.
Important Info:
R is a numpy array of length num_rows. Arbitrary values.
A is a numpy array of length num_rows. Values can be integers between 0 and num_cols - 1.
done is a numpy array of length num_rows. Values are either 0 or 1.
Q is a 2D numpy array with shape (num_rows, num_cols).
Q1 is also a 2D numpy array with shape (num_rows, num_cols).
Here is the loop:
y = np.zeros((num_rows, num_cols))
for i in range(num_rows):
    r = R[i]
    a = A[i]
    q = Q[i]
    adjustment = r
    if not done[i]:
        adjustment += (gamma*max(Q1[i]))
    q[a] = adjustment
    y[i, :] = q
I think I have computed my "adjustments" in a vectorized way with the following lines. I just need to do the assignment to the Q matrix and output the correct y matrix.
These are the lines that I am using to vectorize the first part:
q_max_adjustments = np.multiply(gamma * Q1.max(1), 1 - done)  # numpy array of length num_rows; zero where done == 1
true_adjustments = R + q_max_adjustments  # numpy array of the same length
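As a small self-contained sketch of this adjustment step: it assumes, following the `if not done[i]` branch of the loop, that the gamma term should be added only where done is 0. The three-row inputs here are made up purely for illustration.

```python
import numpy as np

gamma = 0.99
R = np.array([1.0, 2.0, 0.0])      # illustrative values, not real data
done = np.array([0, 1, 0])
Q1 = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0],
               [7.0, 8.0, 9.0]])

# Row-wise maximum of Q1, scaled by gamma, kept only where done == 0.
q_max_adjustments = gamma * Q1.max(axis=1) * (1 - done)
true_adjustments = R + q_max_adjustments

# Same result via np.where, which does not rely on done being exactly 0 or 1.
alt = R + np.where(done.astype(bool), 0.0, gamma * Q1.max(axis=1))
```

Both `true_adjustments` and `alt` are 1D arrays of length num_rows, one adjustment per row.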
An example input and output would be:
import numpy as np

gamma = 0.99
R = np.array([1,2,0,3,2])
A = np.array([0,2,0,1,1])
done = np.array([0,1,0,0,1])
Q = np.array([[1,2,3],
              [4,5,6],
              [7,8,9],
              [10,11,12],
              [13,14,15]])
Q1 = np.array([[1,2,3],
               [4,5,6],
               [7,8,9],
               [10,11,12],
               [13,14,15]])
The output y should be:
array([[ 3.97,  2.  ,  3.  ],
       [ 4.  ,  5.  ,  2.  ],
       [ 8.91,  8.  ,  9.  ],
       [10.  , 14.88, 12.  ],
       [13.  ,  2.  , 15.  ]])
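To double-check the expected output, the loop can be run on these inputs. The sketch below wraps it in a hypothetical helper, `loop_reference`, and makes two assumptions that are not in the original loop: each row of Q is copied as float (assigning 3.97 into an integer array would truncate it to 3), and Q itself is left unmodified.

```python
import numpy as np

def loop_reference(R, A, done, Q, Q1, gamma):
    """Hypothetical helper: the for loop from the question, on float copies of Q's rows."""
    num_rows, num_cols = Q.shape
    y = np.zeros((num_rows, num_cols))
    for i in range(num_rows):
        q = Q[i].astype(float)  # copy: Q stays untouched, values are not truncated
        adjustment = float(R[i])
        if not done[i]:
            adjustment += gamma * Q1[i].max()
        q[A[i]] = adjustment
        y[i, :] = q
    return y

gamma = 0.99
R = np.array([1, 2, 0, 3, 2])
A = np.array([0, 2, 0, 1, 1])
done = np.array([0, 1, 0, 0, 1])
Q = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]])
Q1 = Q.copy()  # Q1 has the same values as Q in this example

expected = np.array([[3.97, 2.0, 3.0],
                     [4.0, 5.0, 2.0],
                     [8.91, 8.0, 9.0],
                     [10.0, 14.88, 12.0],
                     [13.0, 2.0, 15.0]])

y = loop_reference(R, A, done, Q, Q1, gamma)
matches = np.allclose(y, expected)
```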
EDIT
I think I hacked together something that works, using a boolean mask, but it probably isn't particularly performant given the number of steps required. Is there a more efficient way to achieve the same goal? The code is below:
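q_max_adjustments = np.multiply(gamma * Q1.max(1), 1 - done)  # zero where done == 1
true_adjustments = R + q_max_adjustments
# Boolean mask that is True only at (i, A[i]) in each row
mask = np.full((num_rows, num_cols), False)
mask[np.arange(num_rows), A] = True
# Broadcast the adjustments as a column against the mask, then write them into Q in place
value_mask = np.multiply(np.vstack(true_adjustments), mask)
np.copyto(Q, value_mask, where=mask)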
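One possible way to cut down the number of steps is NumPy integer array indexing, which can write one value per row in a single assignment without building a mask. This is only a sketch: the helper name `vectorized_targets` is hypothetical, and it assumes y should be a float copy of Q (so Q itself is not modified, unlike the in-place `np.copyto` version).

```python
import numpy as np

def vectorized_targets(R, A, done, Q, Q1, gamma):
    """Hypothetical helper: returns y as a new float array, leaving Q unchanged."""
    y = Q.astype(float)              # astype returns a copy by default
    rows = np.arange(Q.shape[0])     # 0, 1, ..., num_rows - 1
    # For each row i, overwrite column A[i] with its adjustment.
    y[rows, A] = R + gamma * Q1.max(axis=1) * (1 - done)
    return y

gamma = 0.99
R = np.array([1, 2, 0, 3, 2])
A = np.array([0, 2, 0, 1, 1])
done = np.array([0, 1, 0, 0, 1])
Q = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]])
Q1 = Q.copy()  # same values as Q in the example

y = vectorized_targets(R, A, done, Q, Q1, gamma)
```

The pair `(rows, A)` selects exactly one element per row, so the right-hand side needs only one value per row and no (num_rows, num_cols) intermediate is created.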
Q, y, q_max_adjustments, and true_adjustments? Can you provide example inputs for the minimal reproducible example? Are true_adjustments and q_max_adjustments correct?
true_adjustments and q_max_adjustments are correct.
q_max_adjustments, or true_adjustments contain numbers in y.