Although Capsule Networks are good at capturing the positional relationships between features in deep neural networks for visual recognition tasks, they are computationally expensive and not suitable for running on mobile devices. The bottleneck is the computational complexity of the Dynamic Routing mechanism used between capsules. On the other hand, neural networks such as XNOR-Net are fast and computationally efficient but have relatively low accuracy because of the information lost during binarization. This paper proposes a new class of Fully Connected (FC) layers by xnorizing the linear projector outside or inside the Dynamic Routing within the CapsFC layer. Specifically, our proposed FC layers have two versions: XnODR (Xnorizing Linear Projector Outside Dynamic Routing) and XnIDR (Xnorizing Linear Projector Inside Dynamic Routing). To test how well they generalize, we insert them into MobileNet V2 and ResNet-50 separately. Experiments on three datasets, MNIST, CIFAR-10, and MultiMNIST, validate their effectiveness. Our experimental results demonstrate that both XnODR and XnIDR help networks achieve high accuracy with lower FLOPs and fewer parameters (e.g., 95.32% accuracy with 2.99M parameters and 311.22M FLOPs on CIFAR-10).
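To make "xnorizing the linear projector" concrete, the following is a minimal sketch of an XNOR-Net-style binarized linear projection in plain Python. It is not the authors' implementation, and it leaves out Dynamic Routing entirely. The names `binarize`, `xnor_dot`, and `xnor_linear` are hypothetical. Treating zero as positive and using the mean absolute value as the scaling factor are assumptions borrowed from the usual XNOR-Net formulation.

```python
def binarize(values):
    """Pack the signs of `values` into an int and return (bits, scale).

    Bit i is 1 when values[i] >= 0 (assumption: zero counts as +1).
    `scale` is the mean absolute value, used to rescale the binary result.
    """
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    scale = sum(abs(v) for v in values) / len(values)
    return bits, scale


def xnor_dot(x, w):
    """Approximate dot(x, w) by scale_x * scale_w * dot(sign(x), sign(w))."""
    n = len(x)
    x_bits, x_scale = binarize(x)
    w_bits, w_scale = binarize(w)
    mask = (1 << n) - 1
    # XNOR marks the positions where the signs agree; count them (popcount).
    agree = bin(~(x_bits ^ w_bits) & mask).count("1")
    # dot(sign(x), sign(w)) = agreements - disagreements = 2 * agree - n
    sign_dot = 2 * agree - n
    return x_scale * w_scale * sign_dot


def xnor_linear(x, weight_rows):
    """Binarized linear projection: one xnor_dot per output unit (weight row)."""
    return [xnor_dot(x, row) for row in weight_rows]
```

In this sketch the multiply-accumulate of a dense projection is replaced by one XOR, one bit count, and a single rescaling per output. This is the source of the efficiency gain, and the discarded magnitudes are the source of the information loss mentioned above. Whether such a projection sits before the routing loop (as the name XnODR suggests) or inside it (XnIDR) is not shown here.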