This is a code snippet that uses the Keras library for creating models:
    for state, action, reward, next_state, done in minibatch:
        target = reward
        if not done:
            target = (reward + self.gamma *
                      np.amax(self.model.predict(next_state)[0]))
        target_f = self.model.predict(state)
        #print (target_f)
        target_f[0][action] = target
        self.model.fit(state, target_f, epochs=1, verbose=0)
I am trying to vectorize it. The only approach I can think of is:
1. Create a NumPy table in which each row is (state, action, reward, next_state, done, target), so there are as many rows as there are samples in the minibatch.
2. Update the target column based on the other columns, using masked arrays:
    target[done == True] = reward
    target[done == False] = reward + self.gamma * np.amax(self.model.predict(next_state)[0])
3. Call self.model.fit(state, target_f, epochs=1, verbose=0) with the updated targets.
NB: state is 8-D, so state vector has 8 elements.
Despite hours of effort, I have been unable to code this properly. Is it actually possible to vectorize this piece of code?
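For reference, the plan described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `replay_vectorized` is hypothetical, `model` is assumed to be any object exposing Keras-style `predict` and `fit` methods, and each `state` / `next_state` is assumed to be an array of shape (1, 8) as in the loop version. Instead of a table with one row per sample, it stacks each field into its own array and uses integer-array indexing to write the targets.

```python
import numpy as np

def replay_vectorized(model, minibatch, gamma):
    """One batched update over (state, action, reward, next_state, done) tuples."""
    # Assumption: every state / next_state has shape (1, 8), so stacking
    # them row-wise yields arrays of shape (batch, 8).
    states = np.vstack([s for s, _, _, _, _ in minibatch])
    actions = np.array([a for _, a, _, _, _ in minibatch], dtype=int)
    rewards = np.array([r for _, _, r, _, _ in minibatch], dtype=float)
    next_states = np.vstack([ns for _, _, _, ns, _ in minibatch])
    dones = np.array([d for _, _, _, _, d in minibatch], dtype=bool)

    # One forward pass per array instead of one per sample.
    next_q = np.asarray(model.predict(next_states))   # (batch, n_actions)
    target_f = np.array(model.predict(states))        # writable copy

    # target = reward                                  if done
    #        = reward + gamma * max_a Q(next_state, a) otherwise
    targets = np.where(dones, rewards,
                       rewards + gamma * np.amax(next_q, axis=1))

    # Overwrite only the Q-value of the action taken in each row.
    target_f[np.arange(len(actions)), actions] = targets

    # A single fit call on the whole minibatch.
    model.fit(states, target_f, epochs=1, verbose=0)
    return target_f
```

Inside the original class this would be called as something like `replay_vectorized(self.model, minibatch, self.gamma)`. Note that it is not numerically identical to the loop: the loop calls `fit` after every sample, so later predictions see updated weights, whereas here all predictions use the same weights and the batch is fitted once.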