Convolution is one of the basic building blocks of CNN architectures. Despite its common use, standard convolution has two main shortcomings: it is content-agnostic and computation-heavy. Dynamic filters are content-adaptive, but they further increase the computational overhead. Depth-wise convolution is a lightweight variant, but it usually leads to a drop in CNN performance or requires a larger number of channels. In this work, we propose the Decoupled Dynamic Filter (DDF), which tackles both shortcomings simultaneously. Inspired by recent advances in attention, DDF decouples a depth-wise dynamic filter into spatial and channel dynamic filters. This decomposition considerably reduces the number of parameters and limits the computational cost to the same level as depth-wise convolution. Meanwhile, we observe a significant boost in performance when replacing standard convolution with DDF in classification networks. ResNet50 and ResNet101 improve by 1.9% and 1.3% in top-1 accuracy, respectively, while their computational costs are reduced by nearly half. Experiments on detection and joint upsampling networks also demonstrate the superior performance of the DDF upsampling variant (DDF-Up) in comparison with standard convolution and specialized content-adaptive layers.
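To make the decoupling concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the spatial and channel filters are already given and that they are combined by an element-wise product at each pixel and channel. The abstract does not spell out either the combination rule or how the filters are generated, so both are assumptions here. The names `ddf_apply` and `filter_sizes` are hypothetical.

```python
def ddf_apply(x, spatial_filter, channel_filter):
    """Apply a decoupled dynamic filter with zero padding (illustrative only).

    x:              x[c][i][j], a C x H x W feature map
    spatial_filter: spatial_filter[i][j][u][v], one k x k filter per pixel
    channel_filter: channel_filter[c][u][v], one k x k filter per channel
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    k = len(channel_filter[0])
    pad = k // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for c in range(C):
        for i in range(H):
            for j in range(W):
                acc = 0.0
                for u in range(k):
                    for v in range(k):
                        ii, jj = i + u - pad, j + v - pad
                        if 0 <= ii < H and 0 <= jj < W:
                            # Assumed combination rule: the effective weight is
                            # the per-pixel weight times the per-channel weight.
                            weight = spatial_filter[i][j][u][v] * channel_filter[c][u][v]
                            acc += weight * x[c][ii][jj]
                out[c][i][j] = acc
    return out


def filter_sizes(C, H, W, k):
    """Filter weights to produce: (full depth-wise dynamic, decoupled)."""
    full = C * H * W * k * k               # one k x k filter per pixel and channel
    decoupled = H * W * k * k + C * k * k  # per-pixel filters plus per-channel filters
    return full, decoupled
```

Calling `filter_sizes(64, 56, 56, 3)` compares how many filter weights must be produced for a full depth-wise dynamic filter against the decoupled form. This count is one way to see where the claimed reduction comes from under the assumptions above.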