Assessing model fit is an important step in many problems. Models are typically fitted to training data by minimizing a loss function, such as the squared error or the negative log-likelihood, and it is natural to want low losses on future data as well. This letter considers the use of a test data set to characterize the out-of-sample losses of a model. We propose a simple model diagnostic tool that provides finite-sample guarantees under weak assumptions. The tool is computationally efficient and can be interpreted as an empirical quantile. Several numerical experiments show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyper-parameter tuning.
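To make the "empirical quantile" reading concrete, here is a minimal sketch of one standard construction. It is not necessarily the exact tool proposed in the letter. It assumes that the n test losses and a future loss are exchangeable. Under that assumption, the k-th smallest test loss with k = ceil((n + 1)(1 - alpha)) is at least as large as the future loss with probability at least 1 - alpha. The function name `loss_quantile_bound` and the toy losses are hypothetical and chosen only for illustration.

```python
import math


def loss_quantile_bound(test_losses, alpha):
    """Order-statistic bound on a future loss (illustrative sketch).

    Assumes the test losses and the future loss are exchangeable.
    Returns the k-th smallest test loss, k = ceil((n + 1) * (1 - alpha)),
    or math.inf when the test set is too small for the requested level.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(test_losses)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few test points to certify this level with a finite bound.
        return math.inf
    return sorted(test_losses)[k - 1]


if __name__ == "__main__":
    # Hypothetical squared-error losses of some fitted model on 19 test points.
    residuals = [0.1 * i for i in range(1, 20)]
    losses = [r ** 2 for r in residuals]
    bound = loss_quantile_bound(losses, alpha=0.25)
    print(f"With probability at least 0.75, a future loss is at most {bound:.4f}")
```

Computing the bound needs only one sort of the test losses, which is consistent with the claim of computational efficiency. Comparing the bounds of two candidate models at the same level alpha is one simple way such a quantity could support model selection.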