Learning robust models that generalize well under changes in the data distribution is critical for real-world applications. To this end, there has been growing interest in learning simultaneously from multiple training domains -- while enforcing different types of invariance across those domains. Yet, all existing approaches fail to show systematic benefits under controlled evaluation protocols. In this paper, we introduce a new regularization -- named Fishr -- that enforces domain invariance in the space of the gradients of the loss: specifically, the domain-level variances of gradients are matched across training domains. Our approach is based on the close relations between the gradient covariance, the Fisher Information, and the Hessian of the loss: in particular, we show that Fishr eventually aligns the domain-level loss landscapes locally around the final weights. Extensive experiments demonstrate the effectiveness of Fishr for out-of-distribution generalization. Notably, Fishr improves the state of the art on the DomainBed benchmark and performs consistently better than Empirical Risk Minimization. The code is released at https://github.com/alexrame/fishr.
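To make the gradient-variance matching concrete, here is a minimal sketch in PyTorch of a Fishr-style penalty; it is illustrative only and is not the released implementation (see the repository above). The helper names `per_sample_grads` and `fishr_penalty`, the naive per-sample gradient loop, and the penalty weight of 0.1 are assumptions made for this sketch; the released code computes per-sample gradients far more efficiently.

```python
# Illustrative sketch of a Fishr-style penalty: match the per-domain
# (diagonal) variances of per-sample loss gradients across training domains.
# Hypothetical helpers for a tiny linear model; not the official implementation.
import torch
import torch.nn.functional as F


def per_sample_grads(model, x, y):
    """Return a (batch, n_params) matrix of per-sample loss gradients."""
    grads = []
    for xi, yi in zip(x, y):
        loss = F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0))
        # create_graph=True keeps the penalty differentiable w.r.t. the weights.
        g = torch.autograd.grad(loss, tuple(model.parameters()), create_graph=True)
        grads.append(torch.cat([gi.reshape(-1) for gi in g]))
    return torch.stack(grads)


def fishr_penalty(model, domain_batches):
    """Penalize differences between domain-level gradient variances."""
    variances = []
    for x, y in domain_batches:
        g = per_sample_grads(model, x, y)               # (batch, n_params)
        variances.append(g.var(dim=0, unbiased=False))  # per-parameter variance
    variances = torch.stack(variances)                  # (n_domains, n_params)
    mean_var = variances.mean(dim=0, keepdim=True)
    # Squared distance of each domain's gradient variance to the mean variance.
    return ((variances - mean_var) ** 2).sum(dim=1).mean()


if __name__ == "__main__":
    # Toy usage: total objective = ERM loss + lambda * Fishr penalty.
    torch.manual_seed(0)
    model = torch.nn.Linear(10, 3)
    domain_batches = [(torch.randn(8, 10), torch.randint(0, 3, (8,)))
                      for _ in range(3)]  # three synthetic training domains
    erm_loss = sum(F.cross_entropy(model(x), y) for x, y in domain_batches) / 3
    loss = erm_loss + 0.1 * fishr_penalty(model, domain_batches)
    loss.backward()
    print("total loss:", float(loss))
```

In this sketch, the penalty shrinks to zero when all training domains induce the same per-parameter gradient variance, which is the invariance the abstract describes.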