As a powerful engine, vanilla convolution has driven major breakthroughs in various computer vision tasks. However, it is both sample-agnostic and content-agnostic, which limits the representation capacity of convolutional neural networks (CNNs). In this paper, we for the first time model scene features as a combination of local spatial-adaptive parts specific to each individual sample and global shift-invariant parts shared by all samples, and then propose a novel two-branch dual complementary dynamic convolution (DCDC) operator to flexibly handle these two types of features. The DCDC operator overcomes the limitations of vanilla convolution and of most existing dynamic convolutions, which capture only spatial-adaptive features, and thus markedly boosts the representation capacity of CNNs. Experiments show that DCDC-based ResNets (DCDC-ResNets) significantly outperform vanilla ResNets and most state-of-the-art dynamic convolutional networks on image classification, as well as on downstream tasks including object detection, instance segmentation, and panoptic segmentation, while requiring fewer FLOPs and parameters.
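To make the two-branch idea concrete, the sketch below shows one possible PyTorch realization: a global branch that applies an ordinary shift-invariant convolution with weights shared across all positions, and a local branch that predicts a spatially-adaptive kernel at every location and applies it to the corresponding neighborhood. This is only an illustrative sketch of the concept described in the abstract, not the authors' DCDC implementation; the kernel-prediction head, the softmax normalization, and the fusion by summation are assumptions.

```python
# Illustrative two-branch dynamic convolution sketch (not the official DCDC code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoBranchDynamicConv(nn.Module):
    """Fuses a shared shift-invariant convolution (global branch) with a
    spatially-adaptive convolution whose kernels are predicted per location
    (local branch)."""

    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k = k
        # Global branch: ordinary convolution, weights shared over all positions.
        self.shared_conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False)
        # Local branch: predicts one k*k kernel per output channel and position
        # (an assumed design; the paper's generator may differ).
        self.kernel_head = nn.Conv2d(in_ch, out_ch * k * k, k, padding=k // 2)
        # 1x1 projection so the local branch operates on out_ch channels.
        self.proj = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        b, _, h, w = x.shape
        global_feat = self.shared_conv(x)                      # shift-invariant part

        # Spatially-adaptive part: gather k*k neighborhoods of the projected
        # input and weight them with location-specific predicted kernels.
        v = self.proj(x)                                       # (b, out_ch, h, w)
        patches = F.unfold(v, self.k, padding=self.k // 2)     # (b, out_ch*k*k, h*w)
        patches = patches.view(b, -1, self.k * self.k, h * w)
        kernels = self.kernel_head(x).view(b, -1, self.k * self.k, h * w)
        kernels = kernels.softmax(dim=2)                       # normalize each local kernel
        local_feat = (patches * kernels).sum(dim=2).view(b, -1, h, w)

        # Fuse the complementary branches (summation assumed here).
        return global_feat + local_feat


# Usage example on a random feature map.
y = TwoBranchDynamicConv(64, 64)(torch.randn(2, 64, 32, 32))
print(y.shape)  # torch.Size([2, 64, 32, 32])
```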