Maxpooling 2x2 array only using numpy

Question

I want help with max pooling using NumPy. I am learning Python for data science, and I have to do max pooling and average pooling over 2x2 blocks. The input can be 8x8 or larger, but I have to pool every 2x2 block. I created a matrix using

k = np.random.randint(1,64,64).reshape(8,8)

This gives me a random 8x8 matrix. From this result I want to do 2x2 max pooling. Thanks in advance.

I tried to split the array, but it didn't work as I expected.
– Arockia Jegan, Commented Sep 25, 2021 at 9:08
Can you post the code and what's happening that you don't expect? Just copy-pasting a function someone gives you won't help you learn it.
– Robin Gertenbach, Commented Sep 25, 2021 at 9:14
This is what I have executed in a Kaggle notebook. I don't know how to elaborate on it more; this is my assignment and I'm totally new to Python and NumPy.
– Arockia Jegan, Commented Sep 25, 2021 at 9:18
So far all we can see is creating a matrix. You say you tried to split the array; how did you do it? Why is it not doing what you expect?
– Robin Gertenbach, Commented Sep 25, 2021 at 9:21

Andras Deak -- Слава Україні · Accepted Answer · 2021-09-25 10:21:26Z


You don't have to compute the necessary strides yourself. You can inject two auxiliary dimensions to create a 4d array that is a 2d collection of 2x2 block matrices, then take the elementwise maximum over each block:

import numpy as np

# use a non-square array (a 3-by-2 grid of 2x2 blocks) so that mixed-up axes show up as errors
arr = np.random.randint(1, 64, 6*4).reshape(6, 4)

m, n = arr.shape
pooled = arr.reshape(m//2, 2, n//2, 2).max((1, 3))

An example instance of the above:

>>> arr
array([[40, 24, 61, 60],
       [ 8, 11, 27,  5],
       [17, 41,  7, 41],
       [44,  5, 47, 13],
       [31, 53, 40, 36],
       [31, 23, 39, 26]])

>>> pooled
array([[40, 61],
       [44, 47],
       [53, 40]])
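
Since the question also asks for average pooling, here is a minimal sketch of how the same reshape could serve both reductions. It reuses the example array printed above; the names `blocks`, `max_pooled` and `avg_pooled` are illustrative and not part of the original answer.

```python
import numpy as np

# Same example array as printed above
arr = np.array([[40, 24, 61, 60],
                [ 8, 11, 27,  5],
                [17, 41,  7, 41],
                [44,  5, 47, 13],
                [31, 53, 40, 36],
                [31, 23, 39, 26]])

m, n = arr.shape
# Axes of the 4d array: (block row, row inside block, block column, column inside block)
blocks = arr.reshape(m // 2, 2, n // 2, 2)

max_pooled = blocks.max(axis=(1, 3))   # max pooling, as in the answer
avg_pooled = blocks.mean(axis=(1, 3))  # average pooling over the same blocks

print(max_pooled)
print(avg_pooled)
```

Only the reduction changes between the two kinds of pooling; the reshape is identical.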

For a completely general block pooling that doesn't assume 2-by-2 blocks:

import numpy as np

# again use coprime dimensions for debugging safety
block_size = (2, 3)
num_blocks = (7, 5)
arr_shape = np.array(block_size) * np.array(num_blocks)
numel = arr_shape.prod()
arr = np.random.randint(1, numel, numel).reshape(arr_shape)

m, n = arr.shape  # pretend we only have this
pooled = arr.reshape(m//block_size[0], block_size[0],
                     n//block_size[1], block_size[1]).max((1, 3))
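
The general version above could be wrapped in a small function. The sketch below is one possible way to do that, not something from the original answer: `block_pool` and its `reducer` parameter are hypothetical names, and cropping trailing rows and columns that do not fill a whole block is an assumption about the desired behaviour (the reshape alone would raise an error for such shapes).

```python
import numpy as np

def block_pool(arr, block_size, reducer=np.max):
    """Pool a 2d array over non-overlapping blocks (hypothetical helper)."""
    bh, bw = block_size
    m, n = arr.shape
    # Assumption: silently drop trailing rows/columns that do not fill a whole block
    cropped = arr[:m - m % bh, :n - n % bw]
    blocks = cropped.reshape(m // bh, bh, n // bw, bw)
    return reducer(blocks, axis=(1, 3))

arr = np.arange(7 * 8).reshape(7, 8)   # 7x8 is not divisible into 2x3 blocks
pooled = block_pool(arr, (2, 3))       # uses the top-left 6x6 part only
print(pooled.shape)                    # (3, 2)
```

Passing `reducer=np.mean` would give average pooling with the same cropping rule.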

answered Sep 25, 2021 at 10:21


6 Comments

Andras Deak -- Слава Україні Over a year ago

@ArockiaJegan I suggest avoiding stride_tricks.as_strided unless really necessary. It's easy to end up with garbage data. We have high-level tools like transpose and reshape to do everything safely.

Sam-gege Over a year ago

When you say "really necessary", do you mean cases where a different stride or dilation is involved, like MaxPool2d in PyTorch? Can reshape also deal with those cases? Thanks!

Andras Deak -- Слава Україні Over a year ago

@Sam-gege "really necessary" is what you can't solve with reshape, transpose or view. I've had one use case so far with as_strided, which was rendered moot with numpy.org/devdocs/reference/generated/…

Andras Deak -- Слава Україні Over a year ago

And I don't know pytorch. But looking at github.com/vdumoulin/conv_arithmetic/blob/master/README.md (linked from pytorch docs): seems like padding is not a problem, but indeed arbitrary strides might be problematic. I'd probably go for this approach (when applicable) or sliding_window_view (but skipping windows as required by strides).

Sam-gege Over a year ago

Thanks Andras. Looks like sliding_window_view is easier. I had a hard time in the beginning experimenting with as_strided and often ended up with garbage data lol. BTW, I've got another similar question regarding max pooling, are you interested in having a look? stackoverflow.com/questions/69423484/…
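
A minimal sketch of the sliding_window_view idea mentioned in the comments above, for the case of overlapping windows with a stride different from the window size. The 3x3 window and stride of 2 are arbitrary illustrative values, and `sliding_window_view` requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

arr = np.arange(36).reshape(6, 6)

# All overlapping 3x3 windows: shape (4, 4, 3, 3)
windows = sliding_window_view(arr, (3, 3))

# Keep every second window in each direction to emulate a stride of 2
stride = 2  # illustrative value
pooled = windows[::stride, ::stride].max(axis=(2, 3))
print(pooled.shape)  # (2, 2)
```

The windows are a read-only view, so no data is copied until the reduction runs.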


Akshay Sehgal · Accepted Answer · 2021-09-25 10:21:07Z

You can solve the convolution part using np.lib.stride_tricks, which is how NumPy generates views from many of its methods in the background. Be careful though: this is memory-level access to NumPy arrays.

Convolve over the (8,8) matrix to get a (4,4) grid of windows, each of (2,2) shape.
Reduce the (2,2) windows with a pooling operation such as mean to get a (4,4) output.

This approach is scalable to larger matrices without any modification and can accommodate larger convolutions as well.

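import numpy as np

k = np.random.randint(1,64,64).reshape(8,8)

# Window size, which is also the step between windows
x,y = 2,2

# One (x,y) window per block: (block rows, block cols, x, y)
shape = k.shape[0]//x, k.shape[1]//y, x, y
# Byte steps: jump x rows / y columns between windows, one row / column inside a window
strides = k.strides[0]*x, k.strides[1]*y, k.strides[0], k.strides[1]

print('expected shape:',shape)
print('required strides:',strides)

convolve = np.lib.stride_tricks.as_strided(k, shape=shape, strides=strides)
print('convolution output shape:',convolve.shape)

# Mean pooling here, despite the variable name; use np.max instead for max pooling
maxpool = np.mean(convolve, axis=(-1,-2))
print('maxpooled output shape:',maxpool.shape)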


print(' ')
print('Input matrix:')
print(k)
print('--------')
print('Output matrix:')
print(maxpool)

expected shape: (4, 4, 2, 2)
required strides: (128, 16, 64, 8)
convolution output shape: (4, 4, 2, 2)
maxpooled output shape: (4, 4)
 
Input matrix:
[[19 32 28 25 31 49 17 18]
 [ 4 19 50 57 29 42  5  8]
 [44 16 54 13 15  1 58 50]
 [18 36 29 12 39 45 47 44]
 [34 31 17 28 35 62 30 54]
 [38 50 14 50 25 24 36  4]
 [58 27 20 34 55 22 63 59]
 [61 30 37 24 23 34  5 16]]
--------
Output matrix:
[[18.5  40.   37.75 12.  ]
 [28.5  27.   25.   49.75]
 [38.25 27.25 36.5  31.  ]
 [44.   28.75 33.5  35.75]]

Just to confirm, if you take just the first (2,2) window of your matrix and apply mean pooling on it, you get 18.5 which is the first value of your output matrix, as expected.

first_window = [[19,32],
                 [4,19]]

np.mean(first_window)

# 18.5
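
The code above reduces with a mean, while the question asks for max pooling as well. A minimal sketch, assuming the same strided view is wanted, swaps the reduction for a maximum and cross-checks the result against the reshape approach. It reuses the input matrix printed above; passing `writeable=False` is an added precaution that is not in the original code.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Input matrix copied from the printed output above
k = np.array([[19, 32, 28, 25, 31, 49, 17, 18],
              [ 4, 19, 50, 57, 29, 42,  5,  8],
              [44, 16, 54, 13, 15,  1, 58, 50],
              [18, 36, 29, 12, 39, 45, 47, 44],
              [34, 31, 17, 28, 35, 62, 30, 54],
              [38, 50, 14, 50, 25, 24, 36,  4],
              [58, 27, 20, 34, 55, 22, 63, 59],
              [61, 30, 37, 24, 23, 34,  5, 16]])

x, y = 2, 2
shape = (k.shape[0] // x, k.shape[1] // y, x, y)
strides = (k.strides[0] * x, k.strides[1] * y, k.strides[0], k.strides[1])

# Read-only view, so an accidental write cannot corrupt k
windows = as_strided(k, shape=shape, strides=strides, writeable=False)
maxpool = windows.max(axis=(-1, -2))

# Cross-check against the reshape-based approach
via_reshape = k.reshape(4, 2, 4, 2).max(axis=(1, 3))
print(np.array_equal(maxpool, via_reshape))
```

Both routes visit the same non-overlapping 2x2 blocks, so they should agree element for element.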

EXPLANATION

Numpy stores its ndarrays as contiguous blocks of memory. Each element is stored in a sequential manner every n bytes after the previous.

So if your 3D array looks like this -

np.arange(0,16).reshape(2,2,4)

#array([[[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7]],
#
#       [[ 8,  9, 10, 11],
#        [12, 13, 14, 15]]])

Then in memory it is stored as one flat sequence, with the elements 0 through 15 placed one after another.

When retrieving an element (or a block of elements), NumPy calculates how many bytes it needs to skip to reach the next element along a given axis. For the above example, assuming 8-byte integers (this depends on the datatype), one step along axis=2 skips 8 bytes, one step along axis=1 skips 8*4 bytes, and one step along axis=0 skips 8*8 bytes.

This is where arr.strides comes in. It shows the number of bytes required to access the next element in that direction.
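
The stride values described above can be inspected directly. This short sketch fixes the dtype explicitly, because the default integer size depends on the platform and NumPy version.

```python
import numpy as np

a = np.arange(16, dtype=np.int64).reshape(2, 2, 4)
print(a.itemsize)  # 8 bytes per element
print(a.strides)   # (64, 32, 8): bytes to skip per step along axis 0, 1, 2

# The same shape with 4-byte integers halves every stride
b = a.astype(np.int32)
print(b.strides)   # (32, 16, 4)
```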

For your case with the (8,8) matrix -

You want to convolve the 8x8 matrix with a (2,2) step in each direction, resulting in a (4,4,2,2) shaped array. Then you reduce the last 2 dimensions in the pooling step (with an average here), resulting in a (4,4) matrix.
The shape is what you define as your expected shape, which is (4,4,2,2) in this case.
The strides tell the view how to step through memory: moving to the next window takes 2 steps in each direction (k.strides[0]*2 = 128 bytes and k.strides[1]*2 = 16 bytes), while moving inside a (2,2) window uses the original strides of (64, 8) bytes.
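
To make the stride arithmetic above concrete, the sketch below computes by hand the byte offset that a (4,4,2,2) view would use and checks that it lands on the expected element of the original matrix. `element_via_strides` is a hypothetical helper written only for this illustration, and the array is assumed to be C-contiguous with 8-byte integers.

```python
import numpy as np

k = np.arange(64, dtype=np.int64).reshape(8, 8)
s0, s1 = k.strides                        # (64, 8) bytes
view_strides = (2 * s0, 2 * s1, s0, s1)   # (128, 16, 64, 8)

def element_via_strides(i, j, a, b):
    """Element [i, j, a, b] of the strided view, found by byte offset (hypothetical helper)."""
    offset = (i * view_strides[0] + j * view_strides[1]
              + a * view_strides[2] + b * view_strides[3])
    # Convert the byte offset into an index into the flat buffer
    return k.ravel()[offset // k.itemsize]

# Window (1, 2), position (1, 0) inside it, is element k[2*1 + 1, 2*2 + 0]
print(element_via_strides(1, 2, 1, 0) == k[3, 4])
```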

NOTE: Try to NEVER hardcode the strides/shapes in this function; doing so can result in memory issues. Always calculate the expected strides and shape from the strides and shape of the original matrix.
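
A minimal sketch of what following this note could look like: the strides are derived from the input array, so the same code works for a 4-byte dtype where the hardcoded values (128, 16, 64, 8) would read the wrong memory. `block_view` is a hypothetical helper name, and `writeable=False` is an extra safety measure not used in the answer's code.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def block_view(k, x, y):
    """Read-only view of non-overlapping (x, y) blocks of a 2d array (hypothetical helper)."""
    shape = (k.shape[0] // x, k.shape[1] // y, x, y)
    # Derived from k.strides, never hardcoded, so any dtype or memory layout works
    strides = (k.strides[0] * x, k.strides[1] * y, k.strides[0], k.strides[1])
    return as_strided(k, shape=shape, strides=strides, writeable=False)

k32 = np.arange(64, dtype=np.int32).reshape(8, 8)
view = block_view(k32, 2, 2)
print(view.strides)      # (64, 8, 32, 4) for 4-byte integers
print(view[3, 3])        # bottom-right 2x2 block of k32
```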

Hope this helps. Read more about stride_tricks here and here.

Amazing, just awesome, but I have to learn about strides and the rest. Anyway, thanks man.
Definitely do. If you want to master NumPy, stride_tricks is absolutely essential, since it allows you to work with arrays at the memory level and do anything you want with them. It's insanely powerful, and it is the method that the majority of NumPy functions use in the background.
Check the last link in my answer. It's a great tutorial with 25 examples for using, understanding, and mastering stride tricks on NumPy arrays, including things like accessing values in a zigzag pattern or a simple transpose.
