Marginal-likelihood-based model selection, though promising, is rarely used in deep learning due to estimation difficulties. Instead, most approaches rely on validation data, which may not be readily available. In this work, we present a scalable marginal-likelihood estimation method to select both hyperparameters and network architectures, based on the training data alone. Some hyperparameters can be estimated online during training, simplifying the procedure. Our marginal-likelihood estimate is based on Laplace's method and Gauss-Newton approximations to the Hessian, and it outperforms cross-validation and manual tuning on standard regression and image-classification datasets, especially in terms of calibration and out-of-distribution detection. Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable (e.g., in nonstationary settings).
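To make the idea concrete, the following is a minimal sketch (not the paper's implementation) of Laplace's method for estimating the log marginal likelihood, applied to Bayesian linear regression, where the Gauss-Newton Hessian happens to be exact. The prior precision `alpha` plays the role of a hyperparameter selected by maximizing the estimate on training data alone; all names and the toy data are illustrative assumptions.

```python
import numpy as np

def laplace_log_marginal(X, y, prior_prec=1.0, noise_var=1.0):
    """Laplace estimate of log p(D | model) for linear regression with a
    Gaussian prior N(0, prior_prec^-1 I); here the (Gauss-Newton) Hessian
    of the negative log joint is exact, so Laplace's method is too."""
    n, d = X.shape
    # Hessian of the negative log joint at the MAP estimate:
    # H = X^T X / sigma^2 + alpha * I
    H = X.T @ X / noise_var + prior_prec * np.eye(d)
    theta_map = np.linalg.solve(H, X.T @ y / noise_var)
    resid = y - X @ theta_map
    log_lik = -0.5 * (n * np.log(2 * np.pi * noise_var)
                      + resid @ resid / noise_var)
    log_prior = 0.5 * (d * np.log(prior_prec / (2 * np.pi))
                       - prior_prec * theta_map @ theta_map)
    _, logdet_H = np.linalg.slogdet(H)
    # Laplace: log p(D) ~= log joint at MAP + (d/2) log 2*pi - (1/2) log|H|
    return log_lik + log_prior + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet_H

# Toy data: 50 samples, 3 features, small observation noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

# Hyperparameter selection without validation data: pick the prior
# precision that maximizes the estimated marginal likelihood.
alphas = [0.01, 1.0, 100.0]
best = max(alphas, key=lambda a: laplace_log_marginal(X, y, prior_prec=a))
```

Note the Occam's-razor effect: an overly diffuse prior (small `alpha`) pays a penalty through the `log_prior` term, while an overly strong one fits the data poorly, so the marginal likelihood trades fit against complexity automatically.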