I am trying to build an LSTM model that receives a sequence of integers as input and outputs the probability of each integer appearing. If this probability is low, the integer should be considered an anomaly. I tried to follow this tutorial - https://towardsdatascience.com/lstm-autoencoder-for-extreme-rare-event-classification-in-keras-ce209a224cfb - which is where my model comes from. My input looks like this:
[[[3]
  [1]
  [2]
  [0]]

 [[3]
  [1]
  [2]
  [0]]

 [[3]
  [1]
  [2]
  [0]]
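Each block above is one window of 4 timesteps with a single feature; assuming the same reshape that appears in the model code below, the shapes are:

print(train_keys_reshaped.shape)  # (91, 4, 1): 91 windows, 4 timesteps, 1 feature
print(test_keys_reshaped.shape)   # (25, 4, 1)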
However, I can't understand what I get as output:
[[[ 2.7052343 ]
  [ 1.0618575 ]
  [ 1.8257084 ]
  [-0.54579014]]

 [[ 2.9069736 ]
  [ 1.0850943 ]
  [ 1.9787762 ]
  [ 0.01915958]]

 [[ 2.9069736 ]
  [ 1.0850943 ]
  [ 1.9787762 ]
  [ 0.01915958]]
Is this the reconstruction error, or the probabilities for each integer? And if they are probabilities, why aren't they in the range 0-1? I would be grateful if someone could explain this.
The model:
from keras.models import Sequential
from keras.layers import LSTM, Dense, RepeatVector, TimeDistributed
from keras import optimizers

time_steps = 4
features = 1

# reshape the integer-encoded keys into (samples, time_steps, features)
train_keys_reshaped = train_integer_encoded.reshape(91, time_steps, features)
test_keys_reshaped = test_integer_encoded.reshape(25, time_steps, features)

# LSTM autoencoder: the encoder compresses each window, the decoder reconstructs it
model = Sequential()
model.add(LSTM(32, activation='relu', input_shape=(time_steps, features), return_sequences=True))
model.add(LSTM(16, activation='relu', return_sequences=False))
model.add(RepeatVector(time_steps))  # convert the 2D encoder output into the 3D input the decoder expects
model.add(LSTM(16, activation='relu', return_sequences=True))
model.add(LSTM(32, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(features)))

adam = optimizers.Adam(0.0001)
model.compile(loss='mse', optimizer=adam)

# train the autoencoder to reconstruct its own input
model_history = model.fit(train_keys_reshaped, train_keys_reshaped,
                          epochs=700,
                          validation_split=0.1)

predicted_probs = model.predict(test_keys_reshaped)
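What I currently plan to do with predicted_probs is to treat it as a reconstruction of the input and score each window by its reconstruction error, the way the tutorial does. Below is a rough sketch of that idea; the variable names and the percentile-based anomaly_threshold are placeholders I made up, not something from the tutorial:

import numpy as np

# per-window mean squared reconstruction error (a sketch, assuming the model
# output is a reconstruction of the input rather than probabilities)
reconstruction_errors = np.mean(np.square(test_keys_reshaped - predicted_probs), axis=(1, 2))

# flag windows with unusually large reconstruction error; the 95th-percentile
# threshold is a hypothetical placeholder
anomaly_threshold = np.percentile(reconstruction_errors, 95)
anomalies = reconstruction_errors > anomaly_threshold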
