The Capsule Network (CapsNet) is widely believed to be more robust than Convolutional Networks (ConvNets). However, there is no comprehensive comparison between these two networks, and it is also unknown which components of CapsNet affect its robustness. In this paper, we first carefully examine the special designs in CapsNet that differ from those of a ConvNet commonly used for image classification. The examination reveals five major new or different components in CapsNet: a transformation process, a dynamic routing layer, a squashing function, a margin loss in place of the cross-entropy loss, and an additional class-conditional reconstruction loss for regularization. Along these differences, we conduct comprehensive ablation studies on three kinds of robustness: robustness to affine transformations, robustness to overlapping digits, and semantic representation. The study reveals that some designs thought to be critical to CapsNet, namely the dynamic routing layer and the transformation process, can actually harm its robustness, while others are beneficial. Based on these findings, we propose enhanced ConvNets simply by introducing the essential components behind CapsNet's success. The proposed simple ConvNets achieve better robustness than CapsNet.
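For reference, two of the components named above have compact, well-known definitions. The sketch below (PyTorch, not taken from the paper's code release) shows the squashing function and the margin loss as introduced in the original CapsNet paper (Sabour et al., 2017); the hyperparameter names `m_pos`, `m_neg`, and `lam` and their default values follow that paper and are assumed here only for illustration.

```python
import torch

def squash(s, dim=-1, eps=1e-8):
    # Squashing non-linearity from the original CapsNet:
    # shrinks short vectors toward zero and long vectors toward unit length,
    # so a capsule's length can be read as the probability that its entity exists.
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

def margin_loss(v_lengths, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
    # Margin loss used by CapsNet in place of cross-entropy.
    # v_lengths: (batch, num_classes) capsule lengths; targets: one-hot labels.
    pos = targets * torch.clamp(m_pos - v_lengths, min=0.0) ** 2
    neg = lam * (1.0 - targets) * torch.clamp(v_lengths - m_neg, min=0.0) ** 2
    return (pos + neg).sum(dim=1).mean()
```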