Recent work has investigated deep learning models trained by optimising PAC-Bayes bounds, with priors that are learnt on subsets of the data. This combination has been shown to lead not only to accurate classifiers, but also to remarkably tight risk certificates, showing promise for self-certified learning (i.e. using all the data to learn a predictor and certify its quality). In this work, we empirically investigate the role of the prior. We experiment on 6 datasets with different strategies and amounts of data for learning data-dependent PAC-Bayes priors, and we compare them in terms of their effect on the test performance of the learnt predictors and the tightness of their risk certificates. We ask how much data should be allocated to building the prior and show that the optimum may be dataset-dependent. We demonstrate that using a small percentage of the prior-building data for validation of the prior leads to promising results. We include a comparison of underparameterised and overparameterised models, along with an empirical study of different training objectives and regularisation strategies for learning the prior distribution.
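To make the data allocation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a standard PAC-Bayes-kl bound of the form kl(empirical risk || true risk) <= (KL(Q||P) + ln(2*sqrt(n)/delta)) / n, evaluated only on the n examples that were not used to build the prior. The function names, the split fractions, and the numbers in the usage lines are hypothetical, and the sketch ignores the extra step needed when the empirical risk of a randomised predictor is itself estimated by sampling.

```python
import math
import random


def split_indices(n, prior_frac, val_frac, seed=0):
    """Shuffle range(n) and split it into three disjoint index lists:
    prior training, prior validation, and certificate evaluation.

    prior_frac: fraction of all data allocated to building the prior.
    val_frac:   fraction of the prior-building data held out to validate the prior.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_prior = int(round(prior_frac * n))
    n_val = int(round(val_frac * n_prior))
    prior_val = idx[:n_val]
    prior_train = idx[n_val:n_prior]
    certificate = idx[n_prior:]
    return prior_train, prior_val, certificate


def binary_kl(q, p):
    """KL divergence kl(q || p) between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12  # clamp to avoid log(0)
    q = min(max(q, eps), 1 - eps)
    p = min(max(p, eps), 1 - eps)
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))


def kl_inverse_upper(q, c, tol=1e-9):
    """Approximate the largest p in [q, 1] with kl(q || p) <= c by bisection.

    Returns the upper end of the final bracket, so the result errs on the
    conservative (larger) side.
    """
    lo, hi = q, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_kl(q, mid) > c:
            hi = mid
        else:
            lo = mid
    return hi


def pac_bayes_kl_certificate(emp_risk, kl_qp, n_cert, delta=0.05):
    """Upper bound on the true risk from the assumed PAC-Bayes-kl bound.

    emp_risk: empirical risk (in [0, 1]) on the n_cert certificate examples.
    kl_qp:    KL(Q || P) between the posterior and the data-dependent prior.
    """
    c = (kl_qp + math.log(2 * math.sqrt(n_cert) / delta)) / n_cert
    return kl_inverse_upper(emp_risk, c)


# Hypothetical usage: half the data builds the prior, 10% of that half validates it.
prior_train, prior_val, certificate = split_indices(1000, prior_frac=0.5, val_frac=0.1)
bound = pac_bayes_kl_certificate(emp_risk=0.02, kl_qp=100.0, n_cert=len(certificate))
```

The sketch exposes the trade-off the abstract refers to: allocating more data to the prior can reduce KL(Q||P), but it leaves fewer examples for the certificate, which increases the right-hand side of the bound.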