I am trying to implement a convolutional layer in Python using NumPy.
The input is a 4-dimensional array of shape [N, H, W, C], where:
N: Batch size
H: Height of image
W: Width of image
C: Number of channels
The convolutional filter is also a 4-dimensional array of shape [F, F, Cin, Cout], where:
F: Height and width of a square filter
Cin: Number of input channels (Cin = C)
Cout: Number of output channels
Assuming a stride of one along all axes, and no padding, the output should be a 4-dimensional array of shape [N, H - F + 1, W - F + 1, Cout].
My code is as follows:
import numpy as np

def conv2d(image, filter):
    # Height and width of output image
    Hout = image.shape[1] - filter.shape[0] + 1
    Wout = image.shape[2] - filter.shape[1] + 1
    output = np.zeros([image.shape[0], Hout, Wout, filter.shape[3]])
    for n in range(output.shape[0]):
        for i in range(output.shape[1]):
            for j in range(output.shape[2]):
                for cout in range(output.shape[3]):
                    # Elementwise product of the F x F x Cin patch with one filter, then sum
                    output[n, i, j, cout] = np.multiply(image[n, i:i+filter.shape[0], j:j+filter.shape[1], :], filter[:, :, :, cout]).sum()
    return output
This produces the correct output, but it uses four nested for loops and is extremely slow. Is there a better way of implementing a convolutional layer that takes a 4-dimensional input and filter, and returns a 4-dimensional output, using NumPy?
filter.shape[3], is it 4-dimensional?
filter = np.random.randint(0, 2, [5, 5, 3, 16]). This would be a 5 x 5 filter that operates on a three-channel input image and generates an output 'image' with 16 channels.
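
One possible way to remove the Python loops, sketched here under the assumption that NumPy 1.20 or later is available (for `sliding_window_view`), is to build a strided view of all F x F patches and contract it with the filter using `np.einsum`. The names `conv2d_vectorized` and `kernel` are illustrative and not part of the original code. It assumes the same layout as above (stride one, no padding) and has not been benchmarked here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_vectorized(image, kernel):
    """Stride-1, no-padding convolution (cross-correlation, as in the loop version).

    image:  [N, H, W, Cin]
    kernel: [Fh, Fw, Cin, Cout]  (hypothetical name; avoids shadowing the builtin `filter`)
    returns [N, H - Fh + 1, W - Fw + 1, Cout]
    """
    fh, fw = kernel.shape[0], kernel.shape[1]
    # View of every Fh x Fw patch, without copying the data.
    # Shape: [N, Hout, Wout, Cin, Fh, Fw] (window axes are appended last).
    windows = sliding_window_view(image, (fh, fw), axis=(1, 2))
    # Sum over the patch height (h), patch width (w) and input channels (c).
    return np.einsum('nijchw,hwco->nijo', windows, kernel)


if __name__ == "__main__":
    image = np.random.rand(8, 32, 32, 3)
    kernel = np.random.randint(0, 2, [5, 5, 3, 16])
    print(conv2d_vectorized(image, kernel).shape)  # (8, 28, 28, 16)
```

The view itself costs no extra memory, but `einsum` may still create temporaries internally, so for large inputs it is worth measuring against alternatives such as `np.tensordot` on the same view.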