Transfer Learning via Test-Time Neural Networks Aggregation

Deep neural networks have been shown to outperform traditional machine learning methods. However, deep networks lack generalisability: they do not perform as well on a new (testing) set drawn from a different distribution, owing to domain shift. To tackle this known issue, several transfer learning approaches have been proposed, in which the knowledge of a trained model is transferred to another model to improve performance on different data. However, most of these approaches require additional training steps, or they suffer from catastrophic forgetting, which occurs when a trained model overwrites previously learnt knowledge. We address both problems with a novel transfer learning approach that uses network aggregation. We train dataset-specific networks together with an aggregation network in a unified framework. The loss function includes two main components: a task-specific loss (such as cross-entropy) and an aggregation loss. The proposed aggregation loss allows our model to learn how the parameters of trained deep networks can be aggregated with an aggregation operator. We demonstrate that the proposed approach allows models to be aggregated at test time without any further training step, reducing the burden of transfer learning to a simple arithmetical operation. The proposed approach achieves performance comparable to the baseline. Moreover, if the aggregation operator has an inverse, we show that our model also inherently allows for selective forgetting, i.e., the aggregated model can forget one of the datasets it was trained on while retaining information on the others.
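To make the idea of test-time aggregation and selective forgetting concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name the aggregation operator, so the sketch assumes an element-wise sum over matching parameters, whose inverse is element-wise subtraction. The names `Params`, `aggregate`, and `forget`, as well as the toy parameter values, are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flattened list of weights.
Params = Dict[str, List[float]]


def aggregate(a: Params, b: Params) -> Params:
    """Assumed aggregation operator: element-wise sum of matching parameters."""
    return {k: [x + y for x, y in zip(a[k], b[k])] for k in a}


def forget(agg: Params, b: Params) -> Params:
    """Inverse of the assumed operator: element-wise subtraction."""
    return {k: [x - y for x, y in zip(agg[k], b[k])] for k in agg}


# Toy parameters for three dataset-specific networks (made-up values).
net_a: Params = {"w": [1.0, 2.0], "b": [0.5]}
net_b: Params = {"w": [3.0, -1.0], "b": [0.25]}
net_c: Params = {"w": [-2.0, 4.0], "b": [1.0]}

# Aggregation at test time is plain arithmetic; no training step is involved.
agg_abc = aggregate(aggregate(net_a, net_b), net_c)

# Selective forgetting: remove the contribution of net_b via the inverse.
agg_ac = forget(agg_abc, net_b)
```

Under this assumed operator, forgetting `net_b` yields the same parameters as aggregating only `net_a` and `net_c`. Whether such an aggregate performs well on the task depends on the aggregation loss used during training, which this sketch does not model.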
