A crucial assumption underlying most current machine learning theory is that the training distribution is identical to the test distribution. However, this assumption may not hold in some real-world applications. In this paper, we propose an importance sampling based data variation robust loss (ISloss) for learning problems, which minimizes the worst-case loss under a constraint on the deviation between the training and test distributions. Via importance sampling, this distribution deviation constraint can be converted into a constraint over a set of weight distributions centered on the uniform distribution. Furthermore, we reveal a relationship between ISloss under a logarithmic transformation (LogISloss) and the p-norm loss. We apply the proposed LogISloss to face verification on the Racial Faces in the Wild dataset and show that the proposed method is robust under large distribution deviations.
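To make the worst-case reweighting idea concrete, the sketch below aggregates per-sample losses with a p-norm-style weighting that emphasizes the hardest samples, and applies a logarithmic transform in the spirit of LogISloss. This is a minimal illustration, not the paper's exact formulation: the closed-form p-norm surrogate, the exponent `p`, and the function names are assumptions introduced here for clarity.

```python
# Hedged sketch: worst-case-style reweighting of per-sample losses via a p-norm
# aggregation, followed by a logarithmic transform (LogISloss-style surrogate).
# The exact constraint set and derivation in the paper may differ; `p` controls
# how far the implicit weights are allowed to deviate from uniform.
import torch

def worst_case_weighted_loss(per_sample_losses: torch.Tensor, p: float = 2.0) -> torch.Tensor:
    """p-norm aggregation of per-sample losses.

    p = 1 recovers the ordinary mean (uniform weights); as p grows, the
    aggregation approaches the maximum per-sample loss, i.e. the most
    adversarial reweighting within the implicit deviation budget.
    """
    # Normalized so that p = 1 gives the usual average loss.
    return ((per_sample_losses.clamp(min=0) ** p).mean()) ** (1.0 / p)

def log_is_loss(per_sample_losses: torch.Tensor, p: float = 2.0) -> torch.Tensor:
    """Logarithmic transform of the aggregated loss (illustrative LogISloss surrogate)."""
    return torch.log(worst_case_weighted_loss(per_sample_losses, p) + 1e-12)

# Usage: per-sample losses from any criterion with reduction='none'.
losses = torch.tensor([0.1, 0.5, 2.0, 0.05])
print(worst_case_weighted_loss(losses, p=4.0), log_is_loss(losses, p=4.0))
```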