The DropConnect paper introduces a regularization technique similar to Dropout, but instead of dropping out individual units, it drops out individual connections between units. This is done by applying a mask to the weights of the network, sampled from a Bernoulli distribution.
Let W be the weight matrix of a fully connected layer, v its input vector, and a(·) its activation function, so that the layer computes r = a(Wv).
For training, a mask matrix M of the same shape as W is sampled, with each entry M_ij drawn independently from a Bernoulli(p) distribution, and applied elementwise: r = a((M ⊙ W)v).
For a single example, the implementation is straightforward: just apply a mask of the same shape as the weight matrix before the matrix-vector product.
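As a minimal sketch of this single-example case (assuming p denotes the keep probability and ReLU stands in for the activation a):

```python
import torch

torch.manual_seed(0)

d_in, d_out = 4, 3
W = torch.randn(d_out, d_in)  # weight matrix of the layer
v = torch.randn(d_in)         # a single input example
p = 0.5                       # keep probability (assumption: p = P(keep))

# Sample a Bernoulli mask with the same shape as W, then apply it elementwise.
M = torch.bernoulli(torch.full_like(W, p))

# r = a((M ⊙ W) v), with a = ReLU here
r = torch.relu((M * W) @ v)
```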
For a batch, however, every example needs its own independently sampled mask. Therefore, a mask tensor of shape (batch_size, d_out, d_in) is required rather than a single mask matrix.
In order to update the weight matrix W, the gradient of the loss must flow through the mask; since masking is an elementwise product, the chain rule simply gates each weight's gradient by its mask entry: ∂L/∂W = M ⊙ ∂L/∂(M ⊙ W).
So there is no need to implement an additional backpropagation operation: PyTorch's autograd differentiates through the Hadamard product (`*`) automatically, and that operation is all that is needed.
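Putting the pieces above together, a batched DropConnect linear layer might be sketched as follows. The class name, the keep-probability convention, and the inference-time simplification of scaling weights by p (a Dropout-style approximation; the original paper instead proposes a Gaussian moment-matching scheme at test time) are assumptions of this sketch, not the paper's exact recipe:

```python
import torch
import torch.nn as nn

class DropConnectLinear(nn.Module):
    """Linear layer with DropConnect: a fresh Bernoulli mask per example.

    Assumption: `p` is the keep probability (some texts use p as the
    drop probability instead).
    """

    def __init__(self, d_in: int, d_out: int, p: float = 0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) / d_in ** 0.5)
        self.bias = nn.Parameter(torch.zeros(d_out))
        self.p = p

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, d_in)
        if self.training:
            batch = x.shape[0]
            # One mask per example: shape (batch, d_out, d_in)
            mask = torch.bernoulli(
                torch.full((batch, *self.weight.shape), self.p, device=x.device)
            )
            masked_w = mask * self.weight  # broadcasts the weight over the batch
            # Batched matrix-vector product: (batch, d_out, d_in) x (batch, d_in)
            return torch.einsum("boi,bi->bo", masked_w, x) + self.bias
        # Inference: scale weights by the keep probability (an approximation)
        return x @ (self.p * self.weight).t() + self.bias

layer = DropConnectLinear(4, 3, p=0.5)
layer.train()
y = layer(torch.randn(2, 4))
y.sum().backward()  # autograd handles the Hadamard product; no custom backward
```

Because the mask multiplication is an ordinary elementwise product, `backward()` populates `layer.weight.grad` with the mask-gated gradients without any extra code.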