class _ConvNd(Module):

    __constants__ = ['stride', 'padding', 'dilation', 'groups', 'bias']

    def __init__(self, in_channels, out_channels, kernel_size, stride,
                 padding, dilation, transposed, output_padding, groups, bias):
        super(_ConvNd, self).__init__()
        if in_channels % groups != 0:
            raise ValueError('in_channels must be divisible by groups')
        if out_channels % groups != 0:
            raise ValueError('out_channels must be divisible by groups')
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.transposed = transposed
        self.output_padding = output_padding
        self.groups = groups
        if transposed:
            self.weight = Parameter(torch.Tensor(
                in_channels, out_channels // groups, *kernel_size))
        else:
            self.weight = Parameter(torch.Tensor(
                out_channels, in_channels // groups, *kernel_size))
        if bias:
            self.bias = Parameter(torch.Tensor(out_channels))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()
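We can see clearly that initializing weight and bias simply means allocating a Tensor of the required shape, whose values are then filled in by the reset_parameters() call at the end of the constructor. The only thing the transposed flag changes is the order of the first two weight dimensions: (in_channels, out_channels // groups) instead of (out_channels, in_channels // groups). We can also see that the forward pass does not actually perform any so-called "transposition" of the weights based on the input. I therefore believe that ordinary random initialization is sufficient.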
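To make the shape rule concrete, here is a minimal sketch in plain Python that mirrors only the shape logic of the constructor above. The helper name conv_weight_shape is hypothetical and not part of PyTorch; it assumes the branch on transposed is the only place where the two layer types differ in how the weight is allocated.

```python
def conv_weight_shape(in_channels, out_channels, kernel_size,
                      groups=1, transposed=False):
    """Return the weight shape the _ConvNd constructor would allocate.

    Hypothetical helper for illustration; it copies the checks and the
    transposed/non-transposed branch from the excerpt above.
    """
    if in_channels % groups != 0:
        raise ValueError('in_channels must be divisible by groups')
    if out_channels % groups != 0:
        raise ValueError('out_channels must be divisible by groups')
    if transposed:
        # Transposed convolution: input channels come first.
        return (in_channels, out_channels // groups, *kernel_size)
    # Ordinary convolution: output channels come first.
    return (out_channels, in_channels // groups, *kernel_size)


conv_shape = conv_weight_shape(4, 8, (3, 3))
deconv_shape = conv_weight_shape(4, 8, (3, 3), transposed=True)
print(conv_shape)    # (8, 4, 3, 3)
print(deconv_shape)  # (4, 8, 3, 3)
```

With the same in_channels, out_channels and kernel_size, the two shapes differ only in the order of the first two dimensions, which is consistent with the view that no special initialization scheme is needed for the transposed case.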
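Moreover, if we also look at the torch.nn.Conv2d class, we find that its parameters are initialized through _ConvNd as well. So apart from the difference in weight shape, parameter initialization for Conv2d and ConvTranspose2d should be essentially the same.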
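The shared base class can be checked directly. The following is a small sketch, assuming a PyTorch installation in which the private class _ConvNd is importable from torch.nn.modules.conv (a private name, so its location may change between versions). The channel counts and kernel size are arbitrary example values.

```python
import torch
from torch.nn.modules.conv import _ConvNd  # private class; path is an assumption

conv = torch.nn.Conv2d(4, 8, kernel_size=3)
deconv = torch.nn.ConvTranspose2d(4, 8, kernel_size=3)

# Both layers inherit their parameter setup from _ConvNd.
shared_base = isinstance(conv, _ConvNd) and isinstance(deconv, _ConvNd)

# Only the order of the first two weight dimensions differs.
conv_shape = tuple(conv.weight.shape)
deconv_shape = tuple(deconv.weight.shape)

# The bias has one entry per output channel in both cases.
conv_bias_shape = tuple(conv.bias.shape)
deconv_bias_shape = tuple(deconv.bias.shape)

print(shared_base)
print(conv_shape, deconv_shape)
print(conv_bias_shape, deconv_bias_shape)
```

If both isinstance checks hold, the two layers go through the same constructor and the same reset_parameters() call, so any difference in their initial values comes from the weight shape alone.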