The filter is the key component in modern convolutional neural networks (CNNs). However, since CNNs are usually over-parameterized, a pre-trained network always contains some invalid (unimportant) filters. These filters have relatively small $l_{1}$ norms and contribute little to the output (\textbf{Reason}). While filter pruning removes these invalid filters for efficiency considerations, we instead aim to reactivate them to improve the representation capability of CNNs. In this paper, we introduce filter grafting (\textbf{Method}) to achieve this goal. Reactivation is performed by grafting external information (weights) into invalid filters. To better perform the grafting, we develop a novel criterion to measure the information of filters and an adaptive weighting strategy to balance the grafted information among networks. After the grafting operation, the network has fewer invalid filters than in its initial state, empowering the model with more representation capacity. Meanwhile, since grafting is performed reciprocally across all networks involved, we find that grafting may lose the information of valid filters while improving invalid ones. To gain a universal improvement on both valid and invalid filters, we complement grafting with distillation (\textbf{Cultivation}) to overcome this drawback. Extensive experiments are performed on classification and recognition tasks to show the superiority of our method. Code is available at \textcolor{black}{\emph{https://github.com/fxmeng/filter-grafting}}.
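To make the grafting and adaptive-weighting ideas above concrete, the following is a minimal sketch (in Python/NumPy) of grafting between two networks trained in parallel: each layer's information is approximated by an entropy-style criterion over its weights, and an adaptive coefficient mixes the layer's own weights with the grafted external weights. The function names, histogram binning, and the \emph{scale}/\emph{amplitude} hyperparameters are illustrative assumptions rather than the authors' exact implementation; the official code is available at the repository linked above.

\begin{verbatim}
# Illustrative sketch of layer-wise filter grafting between two networks
# trained in parallel (hypothetical names and hyperparameters, not the
# official implementation).
import numpy as np

def weight_entropy(w, num_bins=10):
    # Approximate the information of a layer by the entropy of a
    # histogram over its weight values.
    hist, _ = np.histogram(w.ravel(), bins=num_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def adaptive_alpha(h_self, h_other, scale=100.0, amplitude=0.4):
    # Map the entropy gap to a mixing coefficient: a layer keeps more of
    # its own weights when they carry more information than the grafted
    # ones. Alpha stays within (0.5 - amplitude/2, 0.5 + amplitude/2).
    return amplitude * (np.arctan(scale * (h_self - h_other)) / np.pi) + 0.5

def graft(w_self, w_other):
    # Graft external information (weights) into this layer as a weighted
    # sum of its own weights and the other network's weights.
    alpha = adaptive_alpha(weight_entropy(w_self), weight_entropy(w_other))
    return alpha * w_self + (1.0 - alpha) * w_other

# Usage: graft the weights of a conv layer of network M1 into the
# corresponding layer of network M2.
rng = np.random.default_rng(0)
layer_m1 = rng.normal(size=(64, 3, 3, 3))   # a conv layer of network M1
layer_m2 = rng.normal(size=(64, 3, 3, 3))   # the same layer in network M2
layer_m2 = graft(layer_m2, layer_m1)
\end{verbatim}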