
I have training data with 250 days, 72 features, and one column for the target variable, and I want to predict the next 30 days for each of 21351 rows with 72 features. How should I reshape my data, both input and output? I am a little confused, and the library keeps giving me errors about shape incompatibility.

I was reshaping as:

trainX.reshape(1, len(trainX), trainX.shape[1])

trainY.reshape(1, len(trainX))

But it gives me this error:

ValueError: Input arrays should have the same number of samples as target arrays. Found 1 input samples and 250 target samples.

Same error with:

trainX.reshape(1, len(trainX), trainX.shape[1])

trainY.reshape(len(trainX), )

and same error with:

trainX.reshape(1, len(trainX), trainX.shape[1])

trainY.reshape(len(trainX), 1)

Currently, trainX is reshaped as:

trainX.reshape(trainX.shape[0], 1, trainX.shape[1])

array([[[  4.49027601e+00,  -3.71848297e-01,  -3.71848297e-01, ...,
           1.06175239e+17,   1.24734085e+06,   5.16668131e+00]],

       [[  2.05921386e+00,  -3.71848297e-01,  -3.71848297e-01, ...,
           8.44426594e+17,   1.39098642e+06,   4.01803817e+00]],

       [[  9.25515792e+00,  -3.71848297e-01,  -3.71848297e-01, ...,
           4.08800518e+17,   1.24441013e+06,   3.69129399e+00]],

       ..., 
       [[  3.80037999e+00,  -3.71848297e-01,  -3.71848297e-01, ...,
           1.35414902e+18,   1.23823291e+06,   3.54601899e+00]],

       [[  3.73994822e+00,  -3.71848297e-01,   8.40698741e+00, ...,
           3.93863169e+17,   1.25693299e+06,   3.29993440e+00]],

       [[  3.56843035e+00,  -3.71848297e-01,   1.53710656e+00, ...,
           3.28306336e+17,   1.22667253e+06,   3.36569960e+00]]])

trainY is reshaped as:

trainY.reshape(trainY.shape[0], )

array([[-0.7238661 ],
       [-0.43128777],
       [-0.31542821],
       [-0.35185375],
       ...,
       [-0.28319519],
       [-0.28740503],
       [-0.24209411],
       [-0.3202021 ]])

and testX is reshaped as:

testX.reshape(1, testX.shape[0], testX.shape[1])

array([[[ -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01, ...,
          -3.71848297e-01,   2.73982042e+06,  -3.71848297e-01],
        [ -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01, ...,
          -3.71848297e-01,   2.73982042e+06,  -3.71848297e-01],
        [ -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01, ...,
           2.00988794e+18,   1.05992636e+06,   2.49920150e+01],
        ...,
        [ -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01, ...,
          -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01],
        [ -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01, ...,
          -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01],
        [ -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01, ...,
          -3.71848297e-01,  -3.71848297e-01,  -3.71848297e-01]]])

and the error is:

ValueError: Error when checking : expected lstm_25_input to have shape (None, 1, 72) but got array with shape (1, 2895067, 72)

EDIT 1:

Here is the code of my model:

trainX = trainX.reshape(trainX.shape[0], 1, trainX.shape[1])
trainY = trainY.reshape(trainY.shape[0], )
testX = testX.reshape(1, testX.shape[0], testX.shape[1])

model = Sequential()

model.add(LSTM(100, return_sequences=True, input_shape=(trainX.shape[0], trainX.shape[2])))
model.add(LSTM(100))
model.add(Dense(1, activation='linear'))

model.compile(loss='mse', optimizer='adam')

model.fit(trainX, trainY, epochs=500, shuffle=False, verbose=1)

model.save('model_lstm.h5')

model = load_model('model_lstm.h5')

prediction = model.predict(testX, verbose=0)

ValueError                                Traceback (most recent call last)
in ()
     43 model.compile(loss='mse', optimizer='adam')
     44
---> 45 model.fit(exog, endog, epochs=50, shuffle=False, verbose=1)
     46
     47 start_date = endog_end + timedelta(days = 1)

D:\AnacondaIDE\lib\site-packages\keras\models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
    865                 class_weight=class_weight,
    866                 sample_weight=sample_weight,
--> 867                 initial_epoch=initial_epoch)
    868
    869     def evaluate(self, x, y, batch_size=32, verbose=1,

D:\AnacondaIDE\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
   1520                 class_weight=class_weight,
   1521                 check_batch_axis=False,
-> 1522                 batch_size=batch_size)
   1523         # Prepare validation data.
   1524         do_validation = False

D:\AnacondaIDE\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_batch_axis, batch_size)
   1376                 self._feed_input_shapes,
   1377                 check_batch_axis=False,
-> 1378                 exception_prefix='input')
   1379         y = _standardize_input_data(y, self._feed_output_names,
   1380                 output_shapes,

D:\AnacondaIDE\lib\site-packages\keras\engine\training.py in _standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    142                 ' to have shape ' + str(shapes[i]) +
    143                 ' but got array with shape ' +
--> 144                 str(array.shape))
    145     return arrays
    146

ValueError: Error when checking input: expected lstm_31_input to have shape (None, 250, 72) but got array with shape (21351, 1, 72)

EDIT 2:

After trying the updated solution from @Paddy, I got this error on calling predict():


ValueError                                Traceback (most recent call last)
in ()
      1 model = load_model('model_lstm.h5')
      2
----> 3 prediction = model.predict(exog_test, verbose=0)
      4 # for x in range(0, len(exog_test)):

D:\AnacondaIDE\lib\site-packages\keras\models.py in predict(self, x, batch_size, verbose)
    911         if not self.built:
    912             self.build()
--> 913         return self.model.predict(x, batch_size=batch_size, verbose=verbose)
    914
    915     def predict_on_batch(self, x):

D:\AnacondaIDE\lib\site-packages\keras\engine\training.py in predict(self, x, batch_size, verbose, steps)
   1693         x = _standardize_input_data(x, self._feed_input_names,
   1694                                     self._feed_input_shapes,
-> 1695                                     check_batch_axis=False)
   1696         if self.stateful:
   1697             if x[0].shape[0] > batch_size and x[0].shape[0] % batch_size != 0:

D:\AnacondaIDE\lib\site-packages\keras\engine\training.py in _standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    130                                  ' to have ' + str(len(shapes[i])) +
    131                                  ' dimensions, but got array with shape ' +
--> 132                                  str(array.shape))
    133     for j, (dim, ref_dim) in enumerate(zip(array.shape, shapes[i])):
    134         if not j and not check_batch_axis:

ValueError: Error when checking : expected lstm_64_input to have 3 dimensions, but got array with shape (2895067, 72)

  • I have successfully trained the model, but it gives me an error on calling predict(). Commented Oct 16, 2017 at 14:53
  • You passed the wrong dimensions to testX. Commented Oct 16, 2017 at 14:54
  • @djk47463 So can you provide me the line for reshaping testX? That's what I am struggling with. Commented Oct 16, 2017 at 16:13
  • The main concern is to get the desired output according to the requirement. Commented Oct 16, 2017 at 16:13
  • @djk47463 In the chat discussion you suggested a single step of concatenating and reshaping the array, but it is not working for me on my original dataset. I have 274 rows and 72 features and want to make timesteps of length 92, but it gives me an error about size incompatibility. Please help me. (See the sliding-window sketch below.) Commented Oct 17, 2017 at 13:54
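To make the timestep question from the last comment concrete, here is a minimal sliding-window sketch with dummy arrays of the same sizes (274 rows, 72 features, windows of 92 timesteps). The data here is random and only stands in for the real dataset:

import numpy as np

# Dummy stand-ins: 274 rows, 72 features, window of 92 timesteps.
n_rows, n_features, window = 274, 72, 92
data = np.random.rand(n_rows, n_features)
target = np.random.rand(n_rows)

# Build overlapping windows: each sample is `window` consecutive rows.
X = np.array([data[i:i + window] for i in range(n_rows - window + 1)])
y = target[window - 1:]   # target aligned with the last row of each window

print(X.shape)  # (183, 92, 72) -> (samples, timesteps, features)
print(y.shape)  # (183,)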

2 Answers


You have:

trainX = trainX.reshape(trainX.shape[0], 1, trainX.shape[1])
trainY = trainY.reshape(trainY.shape[0], )
testX = testX.reshape(1, testX.shape[0], testX.shape[1])

You want:

trainX = trainX.reshape(trainX.shape[0], 1, trainX.shape[1])
trainY = trainY.reshape(trainY.shape[0], )
testX = testX.reshape(testX.shape[0],1, testX.shape[1])

You mixed up the samples and time-step dimensions in testX.
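To see it concretely, here is a minimal sketch with dummy arrays of the sizes from the question (the numbers are placeholders, not real data):

import numpy as np

# Dummy arrays with the sizes from the question.
trainX = np.random.rand(250, 72)     # 250 training days, 72 features
trainY = np.random.rand(250)         # one target value per training day
testX = np.random.rand(21351, 72)    # 21351 rows to predict, 72 features

# Reshape to (samples, timesteps, features) with one timestep per sample.
trainX = trainX.reshape(trainX.shape[0], 1, trainX.shape[1])   # (250, 1, 72)
trainY = trainY.reshape(trainY.shape[0], )                     # (250,)
testX = testX.reshape(testX.shape[0], 1, testX.shape[1])       # (21351, 1, 72)

# trainX and testX now share the (timesteps, features) shape the model expects.
print(trainX.shape, trainY.shape, testX.shape)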


1 Comment

Yeah, I had tried it and it solved my issue. I was just getting the dims of testX wrong. Thank you.

Try this reshape:

trainX.reshape(len(trainX), 1, trainX.shape[1])

trainY.reshape(len(trainX), 1)

But in general you have two options: either reshape the input data or change the model parameters.

And please look at the error message; it tells you everything you need here!
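For example, the second option would look roughly like this (just a sketch: keep the arrays shaped as (samples, 1, 72) and match the layer's input_shape to them):

from keras.models import Sequential
from keras.layers import LSTM, Dense

# Option 2: leave the data as (samples, 1, 72) and tell the model so.
model = Sequential()
model.add(LSTM(100, return_sequences=True, input_shape=(1, 72)))  # (timesteps, features)
model.add(LSTM(100))
model.add(Dense(1, activation='linear'))
model.compile(loss='mse', optimizer='adam')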

OK, here is an updated version of your code:

trainX = trainX.reshape(trainX.shape[0], trainX.shape[1], 1)
trainY = trainY.reshape(trainY.shape[0],)
testX = testX.reshape(testX.shape[0], testX.shape[1], 1)

model = Sequential()

model.add(LSTM(100, return_sequences=True, input_shape=(trainX.shape[1], 1)))
model.add(LSTM(100, return_sequences=False))
model.add(Dense(1, activation='linear'))

model.compile(loss='mse', optimizer='adam')

model.fit(trainX, trainY, epochs=500, shuffle=False, verbose=1)

model.save('model_lstm.h5')

model = load_model('model_lstm.h5')

prediction = model.predict(testX, verbose=0)
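Before calling fit() and predict(), a quick sanity check of the shapes (using the variable names from the snippet above) can save a lot of guesswork:

# All inputs to the LSTM must be 3-D: (samples, timesteps, features).
print(trainX.shape, trainY.shape, testX.shape)
assert trainX.ndim == 3 and testX.ndim == 3
assert trainX.shape[1:] == testX.shape[1:]   # same (timesteps, features)
assert trainX.shape[0] == trainY.shape[0]    # one target per training sample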

6 Comments

Yeah, I have reshaped it like you said as well. Please look at the last three reshaping code snippets for trainX, trainY, and testX respectively. Then it becomes incompatible with the shape of testX (as stated in the error message). Please note that I want to predict multiple steps (30 days, say) for each record in testX.
Also, why have you moved 1 to the last argument in reshape()? It separates each feature into its own array.
please check my answer once again :)
About the 1: normally the smallest number goes to the right; you can check how to prepare shapes for an LSTM if you want to know more...
No, I don't think so, because it will break the columns of each row into rows of a single element. Wesley's answer has solved the issue. Thanks again for your time :)