Deep convolutional neural networks (CNNs) have earned unwavering confidence in their performance on image processing tasks. A CNN architecture comprises several types of layers, including convolution layers and max-pooling layers. CNN practitioners widely understand that the stability of learning depends on how the model parameters in each layer are initialized. The de facto standard initialization scheme is the so-called Kaiming initialization developed by He et al. However, the Kaiming scheme was derived from a much simpler model than the CNN structures in use today, which have evolved considerably since the scheme's introduction. The Kaiming model consists only of convolution and fully connected layers, ignoring the max-pooling layer and the global average pooling layer. In this study, we derive the initialization scheme anew, not from the simplified Kaiming model, but precisely from modern CNN architectures.
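For reference, the Kaiming scheme discussed above draws weights from a zero-mean distribution whose variance is scaled by the layer's fan-in. The following is a minimal NumPy sketch of that standard rule (the function name `kaiming_normal` and the example layer sizes are illustrative, not from this paper):

```python
import numpy as np

def kaiming_normal(fan_in, shape, rng=None):
    # He et al. (2015): W ~ N(0, 2 / fan_in), derived for ReLU activations
    # so that activation variance is preserved across layers.
    rng = np.random.default_rng(rng)
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=shape)

# Example conv layer: 64 output channels, 3 input channels, 3x3 kernel.
# fan_in = in_channels * kernel_height * kernel_width = 3 * 3 * 3 = 27
w = kaiming_normal(fan_in=27, shape=(64, 3, 3, 3), rng=0)
```

Note that this rule accounts only for convolution (or fully connected) layers followed by ReLU; it carries no correction for max-pooling or global average pooling, which is precisely the gap the present study addresses.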