Assessment of model fitness is a key part of machine learning. The standard paradigm is to learn models by minimizing a chosen loss function averaged over training data, with the aim of achieving small losses on future data. In this paper, we consider the use of a finite calibration data set to characterize the future, out-of-sample losses of a model. We propose a simple model diagnostic tool that provides finite-sample guarantees under weak assumptions. The tool is simple to compute and to interpret. Several numerical experiments are presented to show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyper-parameter tuning.
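To make the idea of using a finite calibration set concrete, the sketch below shows one standard way (split calibration with an order-statistic quantile) to obtain a distribution-free, finite-sample bound on out-of-sample losses. This is an illustrative assumption about the general setting, not the paper's specific diagnostic tool; the names `calibration_losses` and `alpha` are hypothetical.

```python
# A minimal sketch, assuming a split-calibration setup: given losses of a
# fitted model on held-out calibration data, return a threshold q such that,
# under exchangeability, a new test loss falls below q with probability
# at least 1 - alpha. This is NOT necessarily the method proposed in the paper.
import math
import numpy as np

def calibrated_loss_bound(calibration_losses: np.ndarray, alpha: float = 0.1) -> float:
    """Distribution-free (1 - alpha) upper bound on a new out-of-sample loss."""
    n = len(calibration_losses)
    # Finite-sample-corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    sorted_losses = np.sort(calibration_losses)
    return float(sorted_losses[rank - 1])  # rank-th order statistic (1-indexed)

# Usage example with synthetic stand-in losses (e.g., per-sample squared errors).
rng = np.random.default_rng(0)
cal_losses = rng.exponential(scale=1.0, size=500)
print(calibrated_loss_bound(cal_losses, alpha=0.1))
```

A bound of this form is simple to compute and interpret, which is in the spirit of the diagnostic described above, though the paper's actual construction and guarantees may differ.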