The performance of machine learning models can differ between training and deployment for many reasons. For instance, performance can change across environments due to degraded data quality, a deployment population that differs from the training population, or a changed relationship between labels and features. These changes result in distribution shifts across environments. Attributing model performance changes to specific shifts is critical for identifying the sources of model failures and for taking mitigating actions that ensure robust models. In this work, we introduce the problem of attributing performance differences between environments to distribution shifts in the underlying data-generating mechanisms. We formulate the problem as a cooperative game in which the players are distributions. We define the value of a set of distributions as the change in model performance when only that set of distributions has shifted between environments, and we derive an importance-weighting method for computing the value of an arbitrary set of distributions. The contribution of each distribution to the total performance change is then quantified as its Shapley value. We demonstrate the correctness and utility of our method on synthetic, semi-synthetic, and real-world case studies, showing its effectiveness in attributing performance changes to a wide range of distribution shifts.
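To make the attribution concrete (in our own notation, which need not match the paper's): write $D$ for the set of distributions that factorize the data-generating mechanism, and $\mathrm{val}(S)$ for the change in model performance when only the distributions in $S \subseteq D$ shift between environments, so that $\mathrm{val}(\emptyset) = 0$. The contribution of distribution $i$ is then its standard Shapley value,

$$\phi_i(\mathrm{val}) \;=\; \sum_{S \subseteq D \setminus \{i\}} \frac{|S|!\,\bigl(|D| - |S| - 1\bigr)!}{|D|!}\,\bigl[\mathrm{val}(S \cup \{i\}) - \mathrm{val}(S)\bigr].$$

By the Shapley efficiency axiom, $\sum_{i \in D} \phi_i(\mathrm{val}) = \mathrm{val}(D)$, so the per-distribution attributions sum exactly to the total performance change between environments.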
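For illustration only, here is a minimal Python sketch of the exact Shapley computation over distributions. It assumes a user-supplied function val that returns the performance change for a given set of shifted distributions (in the paper this quantity is estimated via importance weighting; here it is treated as a black box). The function name shapley_values and the toy player labels are ours, not the paper's, and exact enumeration is exponential in the number of players, so this only suits small factorizations.

from itertools import combinations
from math import factorial

def shapley_values(players, val):
    # players: hashable ids, one per distribution in the factorization,
    #          e.g. ["P(X)", "P(Y|X)"].
    # val:     maps a frozenset of players to the model-performance change
    #          when only those distributions shift; val(frozenset()) == 0.
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            for S in combinations(others, k):
                S = frozenset(S)
                # Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (val(S | {i}) - val(S))
        phi[i] = total
    return phi

# Toy value function: shifting P(X) alone costs 2 points of accuracy,
# shifting P(Y|X) alone costs 3, and both together cost 6 (the effects interact).
val_table = {
    frozenset(): 0.0,
    frozenset({"P(X)"}): 2.0,
    frozenset({"P(Y|X)"}): 3.0,
    frozenset({"P(X)", "P(Y|X)"}): 6.0,
}
print(shapley_values(["P(X)", "P(Y|X)"], val_table.__getitem__))
# {'P(X)': 2.5, 'P(Y|X)': 3.5} -- the attributions sum to val of the full set, 6.0.

When the number of distributions grows, the same computation is typically approximated by Monte Carlo sampling over permutations rather than full subset enumeration.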