The quality of many modern machine learning models improves as model complexity increases, an effect quantified, for predictive performance, by the non-monotonic double descent learning curve. Here we address the overarching question: is there an analogous theory of double descent for models that estimate uncertainty? We provide a partially affirmative and partially negative answer in the setting of Gaussian processes (GPs). Under standard assumptions, we prove that, under the marginal likelihood, model quality for optimally tuned GPs (including the quality of uncertainty prediction) improves with larger input dimension, and therefore exhibits a monotone error curve. After showing that the marginal likelihood does not naturally exhibit double descent in the input dimension, we highlight related forms of posterior predictive loss that do exhibit non-monotonicity. Finally, we verify empirically that our results hold on real data, beyond our stated assumptions, and we explore consequences involving synthetic covariates.
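As a concrete reference for the central quantity discussed above, the following is a minimal NumPy sketch (not from the paper) of the GP log marginal likelihood under a squared-exponential kernel; the kernel choice, hyperparameters, and synthetic data are illustrative assumptions only.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel: k(x, x') = s^2 exp(-||x - x'||^2 / (2 l^2)).
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def log_marginal_likelihood(X, y, lengthscale=1.0, variance=1.0, noise=0.1):
    # Standard GP log marginal likelihood:
    #   log p(y | X) = -1/2 y^T K^{-1} y - 1/2 log|K| - n/2 log(2 pi),
    # with K = k(X, X) + noise * I, computed stably via a Cholesky factor.
    n = len(y)
    K = rbf_kernel(X, X, lengthscale, variance) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2.0 * np.pi))

# Illustrative use: evaluate the marginal likelihood of a fixed-size sample
# as the input dimension d grows (a toy version of the curves in question).
rng = np.random.default_rng(0)
n = 50
for d in (1, 5, 20):
    X = rng.standard_normal((n, d))
    w = rng.standard_normal(d) / np.sqrt(d)   # hypothetical linear target
    y = X @ w + 0.1 * rng.standard_normal(n)
    print(d, log_marginal_likelihood(X, y))
```

This only evaluates the objective at fixed hyperparameters; the paper's "optimally tuned" setting would additionally maximize it over the kernel parameters for each input dimension.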