How can I divide a numpy array into n sub-arrays using a sliding window of size m? [duplicate]

Question

I have a large NumPy array that I want to divide into many subarrays by sliding a window of a fixed size over it. Here is my code for subarrays of size 11:

import numpy as np

x = np.arange(10000)
T = np.array([])

for i in range(len(x)-11):
    s = x[i:i+11]
    T = np.concatenate((T, s), axis=0)

But it is very slow for arrays with more than 1 million entries. Is there any way to make it faster?

I don't know what your overall objective is, but you should probably start with numpy.asarray and, from there, use numpy.split if you want sub-arrays, or numpy.reshape, instead of the concatenation you're doing.
– waffles, Commented Dec 11, 2019 at 2:34

Quang Hoang · Accepted Answer · 2019-12-11 02:54:21Z

3

Actually, this is a case for as_strided:

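import numpy as np
from numpy.lib.stride_tricks import as_strided

# set up
x = np.arange(1000000)
windows = 11

# x.strides is a tuple; take the byte step between consecutive elements
stride = x.strides[0]

# each row starts one element after the previous one, so both strides are equal
T = as_strided(x, shape=(len(x)-windows+1, windows), strides=(stride, stride))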

Output:

array([[     0,      1,      2, ...,      8,      9,     10],
       [     1,      2,      3, ...,      9,     10,     11],
       [     2,      3,      4, ...,     10,     11,     12],
       ...,
       [999987, 999988, 999989, ..., 999995, 999996, 999997],
       [999988, 999989, 999990, ..., 999996, 999997, 999998],
       [999989, 999990, 999991, ..., 999997, 999998, 999999]])

Performance:

5.88 µs ± 1.27 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
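
On NumPy 1.20 or later, the same windowed view can be built without computing strides by hand. The following is a minimal sketch, assuming that version is available; sliding_window_view is not part of the original answer, and the variable names are illustrative. Like as_strided, it returns a view rather than a copy, which is why creating it takes only microseconds.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

x = np.arange(1000000)
window = 11

# One row per window position; no data is copied.
T = sliding_window_view(x, window)

print(T.shape)                 # (999990, 11)
print(np.shares_memory(T, x))  # True: T is a view onto x
print(T.flags.writeable)       # False: the view is read-only by default
```

Because the rows overlap in memory, call T.copy() before modifying windows in place. A full copy of this shape takes roughly 11 times the memory of x.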

answered Dec 11, 2019 at 2:54

Quang Hoang


Community · Accepted Answer · 2020-06-20 09:12:55Z

2

I think your current method does not produce what you are describing. Here is a faster method that splits a long array into many sub-arrays using a list comprehension:

Code Fix:

import numpy as np

x = np.arange(10000)

# same range as in the question; use range(len(x)-11+1) to include the final window as well
T = np.array([x[i:i+11] for i in range(len(x)-11)])

Speed Comparison:

sample_1 = '''
import numpy as np 

x = np.arange(10000)
T = np.array([])

for i in range(len(x)-11):
    s = x[i:i+11]
    T = np.concatenate((T, s),axis=0)

'''    

sample_2 = '''
import numpy as np 

x = np.arange(10000)
T = np.array([])

T = np.array([np.array(x[i:i+11]) for i in range(len(x)-11)])
'''

# Testing the times
import timeit
print(timeit.timeit(sample_1, number=1))
print(timeit.timeit(sample_2, number=1))

Speed Comparison Output:

5.839815437000652   # Your method
0.11047088200211874 # List Comprehension

I only ran 1 iteration because the difference is large enough that more iterations would not change the overall outcome.
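
If a more stable figure is wanted, timeit.repeat can take several measurements so the minimum can be reported. This is only a sketch: the repeat and number values are arbitrary choices, not settings used in the answer.

```python
import timeit

setup = "import numpy as np; x = np.arange(10000)"
stmt = "np.array([x[i:i+11] for i in range(len(x)-11)])"

# Take 3 measurements, each running the statement 5 times.
times = timeit.repeat(stmt, setup=setup, repeat=3, number=5)

# The minimum is usually the least noisy estimate; divide by 5 for the per-run time.
print(min(times) / 5)
```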

Output Comparison:

# Your method:
[  0.00000000e+00   1.00000000e+00   2.00000000e+00 ...,   9.99600000e+03
   9.99700000e+03   9.99800000e+03]

# Using List Comprehension:
[[   0    1    2 ...,    8    9   10]
 [   1    2    3 ...,    9   10   11]
 [   2    3    4 ...,   10   11   12]
 ..., 
 [9986 9987 9988 ..., 9994 9995 9996]
 [9987 9988 9989 ..., 9995 9996 9997]
 [9988 9989 9990 ..., 9996 9997 9998]]

You can see that my method actually produces sub-arrays, unlike the code you provided, which returns a single flat 1D array.
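
The difference in shape can be checked directly on a small array. This is a sketch with illustrative names (flat, windows_2d, m), not code from the question or the answer. It shows that concatenation yields a flat 1D array holding the same values, which a reshape turns into the 2D layout.

```python
import numpy as np

x = np.arange(20)
m = 11  # window size

# Concatenating the slices end to end gives a single flat array.
flat = np.concatenate([x[i:i+m] for i in range(len(x) - m)])

# Stacking the slices as rows gives one sub-array per window.
windows_2d = np.array([x[i:i+m] for i in range(len(x) - m)])

print(flat.shape)        # (99,)
print(windows_2d.shape)  # (9, 11)
print(np.array_equal(flat.reshape(-1, m), windows_2d))  # True
```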

Note:

These tests were run on x, a NumPy array of the ordered integers from 0 to 9999 (np.arange(10000)).

edited Jun 20, 2020 at 9:12


answered Dec 11, 2019 at 2:39

lbragile


2 Comments

lbragile Over a year ago

Also see that range() automatically starts at 0 so there is no need to specify that. Furthermore, your code produced a 1D array due to the concatenation rather than a 2D array (array of arrays).

Fourat Thamri Over a year ago

Thank you, it worked well and is faster. I was reshaping my output to get the same result as yours.
