I made an example diagram of a scaled-down version of what I'm trying to implement:
So the top two input nodes are fully connected only to the top three output nodes, and the same pattern applies to the bottom two input nodes. So far I've come up with two ways of implementing this in PyTorch, neither of which is optimal.
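Since the diagram isn't reproduced here, the connectivity it describes corresponds to this block-diagonal 6x4 weight matrix (w marks a trainable weight, 0 a connection that must stay zero):

    [[w, w, 0, 0],
     [w, w, 0, 0],
     [w, w, 0, 0],
     [0, 0, w, w],
     [0, 0, w, w],
     [0, 0, w, w]]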
The first would be to create an nn.ModuleList of smaller Linear layers and, during the forward pass, loop each group of inputs through its own layer. For the diagram's example, that would look something like this:
import torch
import torch.nn as nn

class Module(nn.Module):
    def __init__(self):
        super().__init__()
        # one small Linear per group: 2 inputs -> 3 outputs, 2 groups
        self.layers = nn.ModuleList([nn.Linear(2, 3) for _ in range(2)])

    def forward(self, input):
        output = torch.zeros(2, 3)
        for i in range(2):
            # route each group of 2 inputs through its own layer
            output[i, :] = self.layers[i](input.view(2, 2)[i, :])
        return output.flatten()
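For reference, a quick sanity check of the shapes (the input here is just a random 4-element vector):

    model = Module()
    out = model(torch.randn(4))  # 4 inputs -> 6 outputs
    print(out.shape)             # torch.Size([6])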
So this reproduces the network in the diagram; the main issue is that it's very slow. I assume this is because PyTorch has to execute the Python for loop sequentially and can't process the input tensor in parallel.
To "vectorize" the module such that PyTorch can run it quicker, I have this implementation:
import torch.nn.utils.prune as prune

class Module(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 6)
        # block-diagonal mask: ones keep a connection, zeros block it
        mask = torch.block_diag(torch.ones(3, 2), torch.ones(3, 2))
        # register the mask once here; re-applying it on every forward
        # pass would stack pruning hooks and waste time
        prune.custom_from_mask(self.layer, name='weight', mask=mask)

    def forward(self, input):
        return self.layer(input)
This also reproduces the diagram's network, by using weight pruning to keep certain weights in the fully connected layer at zero (e.g. the weight connecting the top input node to the bottom output node is always zero, so it's effectively "disconnected"). This module is much faster than the previous one, as there is no Python loop. The problem now is that it takes up significantly more memory: prune.custom_from_mask doesn't delete anything, it keeps the full dense weight as a weight_orig parameter and registers an equally sized weight_mask buffer, so the layer still stores (and multiplies through) every zeroed weight on top of the mask. This implementation essentially keeps far more weights around than it needs.
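The overhead is visible directly on the module. A minimal check, assuming the pruned layer from the code above, shows that both the dense weight and the mask are retained:

    layer = nn.Linear(4, 6)
    mask = torch.block_diag(torch.ones(3, 2), torch.ones(3, 2))
    prune.custom_from_mask(layer, name='weight', mask=mask)
    print([name for name, _ in layer.named_parameters()])  # ['bias', 'weight_orig']
    print([name for name, _ in layer.named_buffers()])     # ['weight_mask']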
Has anyone encountered this issue before and come up with an efficient solution?
