Transferring a deep neural network trained on one problem to another requires only a small amount of data and little additional computation time. The same holds for ensembles of deep learning models, which are typically superior to a single model. However, transferring an ensemble of deep neural networks demands relatively high computational expense and increases the probability of overfitting. Our approach to transfer learning of ensembles consists of two steps: (a) shifting the encoder weights of all models in the ensemble by a single shared shift vector and (b) performing a tiny fine-tuning of each individual model afterwards. This strategy speeds up training and makes it possible to add new models to an ensemble with significantly reduced training time by reusing the shift vector. We compare different strategies by computation time, ensemble accuracy, uncertainty estimation and disagreement, and conclude that our approach gives results competitive with the traditional approach at the same computational complexity. Our method also keeps the diversity of the ensemble's models higher.
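A minimal sketch of the two-step procedure in PyTorch, under the assumption that each ensemble member exposes an `encoder` submodule and that the shared shift vector is taken as the encoder weight delta of one reference model fully fine-tuned on the target task; the `fine_tune` routine and `target_loader` are hypothetical placeholders, not part of the original description.

```python
# Sketch only: assumes models with an `encoder` submodule; `fine_tune` and
# `target_loader` are hypothetical placeholders supplied by the user.
import copy
import torch
import torch.nn as nn


def encoder_vector(model: nn.Module) -> torch.Tensor:
    """Flatten all encoder parameters into a single vector."""
    return nn.utils.parameters_to_vector(model.encoder.parameters()).detach()


def apply_shift(model: nn.Module, shift: torch.Tensor) -> None:
    """Add the shared shift vector to the model's encoder weights in place."""
    vec = nn.utils.parameters_to_vector(model.encoder.parameters())
    nn.utils.vector_to_parameters(vec + shift, model.encoder.parameters())


def transfer_ensemble(ensemble, fine_tune, target_loader, short_epochs=1):
    """(a) shift every encoder by one shared vector, (b) briefly fine-tune each model."""
    # Obtain the shift vector from a single reference model (assumption:
    # the delta of one fully fine-tuned member defines the shared shift).
    reference = copy.deepcopy(ensemble[0])
    before = encoder_vector(reference)
    fine_tune(reference, target_loader)              # full fine-tuning of one model
    shift = encoder_vector(reference) - before

    # Step (a): apply the same shift to all ensemble members.
    for model in ensemble:
        apply_shift(model, shift)

    # Step (b): tiny fine-tuning of each member on the target data.
    for model in ensemble:
        fine_tune(model, target_loader, epochs=short_epochs)
    return ensemble
```

The same stored `shift` can later be applied to a freshly pretrained model before its short fine-tuning pass, which is how new members could be added to the ensemble at reduced cost.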