Capsule neural networks replace simple, scalar-valued neurons with vector-valued capsules. They are motivated by the pattern recognition system in the human brain, where complex objects are decomposed into a hierarchy of simpler object parts. Such a hierarchy is referred to as a parse-tree. Conceptually, capsule neural networks have been defined to realize such parse-trees. The capsule neural network (CapsNet) by Sabour, Frosst, and Hinton is the first actual implementation of this conceptual idea. CapsNets achieved state-of-the-art performance on simple image recognition tasks, with fewer parameters and greater robustness to affine transformations than comparable approaches. This sparked extensive follow-up research. However, despite major efforts, no work has been able to scale the CapsNet architecture to reasonably sized datasets. Here, we provide a reason for this failure and argue that it is most likely not possible to scale CapsNets beyond toy examples. In particular, we show that the concept of a parse-tree, the main idea behind capsule neural networks, is not present in CapsNets. We also show, theoretically and experimentally, that CapsNets suffer from a vanishing gradient problem that results in the starvation of many capsules during training.
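For readers unfamiliar with the mechanism the abstract refers to, the following is a minimal NumPy sketch of vector-valued capsules and the routing-by-agreement procedure from Sabour, Frosst, and Hinton's CapsNet: the squash nonlinearity and the iterative coupling-coefficient update. The shapes, variable names, and toy dimensions are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squash nonlinearity from Sabour et al. (2017): preserves the vector's
    # direction while shrinking short vectors toward zero and long vectors
    # toward unit length, so the norm can act as an activation probability.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * (s / np.sqrt(sq_norm + eps))

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: prediction vectors of shape (num_in, num_out, dim_out), i.e.
    # each lower-level capsule i predicts the pose of each higher capsule j.
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, initialized to zero
    for _ in range(num_iters):
        # Coupling coefficients: softmax over the output capsules j.
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c = c / c.sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)    # weighted sum of predictions
        v = squash(s)                             # output capsule vectors
        b = b + np.einsum('ijd,jd->ij', u_hat, v) # increase logits by agreement
    return v

# Toy usage (hypothetical sizes): 8 input capsules routed to 3 output
# capsules of dimension 4.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 3, 4))
v = dynamic_routing(u_hat)
print(v.shape)  # (3, 4)
```

Note that each lower capsule's contribution, and hence its gradient signal, is gated by the coupling coefficients `c`; under the abstract's vanishing-gradient argument, capsules whose coefficients collapse receive little training signal and starve.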