Out-of-distribution (OOD) generalization remains a key challenge for real-world machine learning systems. We describe a method for OOD generalization that, during training, encourages models to preserve only those features that are reused well across multiple training domains. Our method combines two complementary neuron-level regularizers with a probabilistic, differentiable binary mask over the network to extract a modular sub-network that achieves better OOD performance than the original network. Preliminary evaluation on two benchmark datasets corroborates the promise of our method.
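To make the mask idea concrete, the sketch below shows one common way to implement a probabilistic, differentiable binary mask over a layer's neurons using a Gumbel-sigmoid (binary concrete) relaxation with a straight-through estimator. This is an illustrative assumption, not the paper's implementation; the class name `NeuronMask`, the temperature, the sparsity penalty, and its weight are all hypothetical choices, and the paper's two neuron-level regularizers are not reproduced here.

```python
# Minimal sketch (assumed, not the authors' code) of a probabilistic,
# differentiable binary mask over a layer's neurons.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NeuronMask(nn.Module):
    """Learns a per-neuron keep probability and samples a near-binary mask."""

    def __init__(self, num_neurons: int, temperature: float = 0.5):
        super().__init__()
        # Logits of the keep probability for each neuron.
        self.logits = nn.Parameter(torch.zeros(num_neurons))
        self.temperature = temperature

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Gumbel-sigmoid relaxation: differentiable samples that
            # concentrate near {0, 1} as the temperature decreases.
            u = torch.rand_like(self.logits).clamp(1e-6, 1 - 1e-6)
            noise = torch.log(u) - torch.log1p(-u)
            soft_mask = torch.sigmoid((self.logits + noise) / self.temperature)
            # Straight-through: forward pass uses the hard 0/1 mask,
            # backward pass uses the soft relaxation's gradient.
            hard_mask = (soft_mask > 0.5).float()
            mask = hard_mask + (soft_mask - soft_mask.detach())
        else:
            # At evaluation time, threshold the learned probabilities.
            mask = (torch.sigmoid(self.logits) > 0.5).float()
        return x * mask

    def sparsity_penalty(self) -> torch.Tensor:
        # Expected number of retained neurons; adding this to the loss
        # pushes the mask toward a small, modular sub-network.
        return torch.sigmoid(self.logits).sum()


# Usage: wrap a hidden layer's activations with the mask.
layer = nn.Linear(128, 64)
mask = NeuronMask(num_neurons=64)
x = torch.randn(8, 128)
h = mask(F.relu(layer(x)))                      # masked hidden features
loss = h.pow(2).mean() + 1e-3 * mask.sparsity_penalty()  # placeholder objective
loss.backward()
```

In this sketch the mask parameters are trained jointly with the network, and at test time neurons whose keep probability falls below 0.5 are pruned, yielding the extracted sub-network.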