In this paper, we introduce a convolutional network, which we call MultiPodNet, consisting of two or more convolutional networks that process the input image in parallel toward the same goal. The output feature maps of the parallel convolutional networks are fused at the fully connected layer of the network. We experimentally observed that three parallel pod networks (TripodNet) produce the best results on commonly used object recognition datasets. The baseline pod networks can be of any type. In this paper, we use ResNets as the baseline networks, and their inputs are augmented image patches. The number of parameters of the TripodNet is about three times that of a single ResNet. We train the TripodNet using standard backpropagation-type algorithms. The parameters of each individual ResNet are initialized with different random numbers during training. The TripodNet achieved state-of-the-art performance on the CIFAR-10 and ImageNet datasets. For example, it improved the accuracy of a single ResNet from 91.66% to 92.47% under the same training procedure on the CIFAR-10 dataset.
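To make the parallel-pod idea concrete, the following is a minimal PyTorch sketch of the architecture described above. It assumes resnet18 backbones, feature concatenation before a shared fully connected layer, and one augmented patch per pod as the input convention; these specific choices are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class TripodNet(nn.Module):
    """Sketch: three parallel ResNet pods fused at a shared fully connected layer."""
    def __init__(self, num_classes=10, num_pods=3):
        super().__init__()
        # Each pod is an independent ResNet backbone; its parameters are
        # initialized with different random numbers (resnet18 is an assumed baseline).
        self.pods = nn.ModuleList()
        for _ in range(num_pods):
            pod = resnet18(weights=None)
            feat_dim = pod.fc.in_features
            pod.fc = nn.Identity()  # drop each pod's own classifier head
            self.pods.append(pod)
        # Shared fully connected layer that fuses the concatenated pod features.
        self.fc = nn.Linear(feat_dim * num_pods, num_classes)

    def forward(self, patches):
        # `patches` is a sequence of augmented views of the same image,
        # one per pod (an assumed input convention for this sketch).
        feats = [pod(x) for pod, x in zip(self.pods, patches)]
        return self.fc(torch.cat(feats, dim=1))


# Usage sketch: three differently augmented 32x32 CIFAR-10-sized patches per image.
model = TripodNet(num_classes=10)
views = [torch.randn(4, 3, 32, 32) for _ in range(3)]
logits = model(views)  # shape: (4, 10)
```

Because the pods share no weights, the parameter count grows roughly linearly with the number of pods, which matches the stated factor of about three for TripodNet relative to a single ResNet.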