Although overparameterized models have achieved success on many machine learning tasks, their accuracy can drop when the test distribution differs from the training distribution. This accuracy drop still limits the deployment of machine learning in the wild. At the same time, importance weighting, a traditional technique for handling distribution shifts, has been shown both empirically and theoretically to have little or even no effect on overparameterized models. In this paper, we propose importance tempering to improve the decision boundary and achieve consistently better results for overparameterized models. Theoretically, we show that the appropriate choice of group temperature differs between the label shift and spurious correlation settings. We also prove that properly selected temperatures can alleviate minority collapse in imbalanced classification. Empirically, we achieve state-of-the-art results on worst-group classification tasks using importance tempering.
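As a minimal sketch of the idea, one plausible instantiation of importance tempering (the function names and the specific form below are illustrative assumptions, not the paper's exact formulation) raises group importance weights to a power 1/T before renormalizing, so that the temperature T interpolates between plain importance weighting (T = 1) and uniform weighting (large T):

```python
def temper_weights(group_weights, temperature):
    """Raise importance weights to the power 1/temperature and renormalize.

    temperature = 1.0 recovers ordinary (normalized) importance weighting;
    larger temperatures flatten the weights toward uniform. This is an
    illustrative sketch of tempering, not the paper's exact method.
    """
    tempered = [w ** (1.0 / temperature) for w in group_weights]
    total = sum(tempered)
    return [t / total for t in tempered]


def tempered_group_loss(per_group_losses, group_weights, temperature):
    """Average per-group losses under tempered importance weights."""
    weights = temper_weights(group_weights, temperature)
    return sum(w * loss for w, loss in zip(weights, per_group_losses))
```

For example, with group weights [9.0, 1.0], a temperature of 1 keeps the 9:1 ratio, while a temperature of 2 softens it to 3:1, reducing how aggressively the minority group dominates the objective.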