In supervised batch learning, the predictive normalized maximum likelihood (pNML) has been proposed as the min-max regret solution for the distribution-free setting, where no distributional assumptions are made on the data. However, the pNML is not defined for large-capacity hypothesis classes such as over-parameterized linear regression. For a large class, a common approach is to use regularization or a model prior. In the context of online prediction, where the min-max solution is the Normalized Maximum Likelihood (NML), it has been suggested to use NML with "luckiness": a prior-like function is applied to the hypothesis class, which reduces its effective size. Motivated by the luckiness concept, for linear regression we incorporate a luckiness function that penalizes the hypothesis proportionally to its ℓ2 norm. This leads to the ridge regression solution. The associated pNML with luckiness (LpNML) prediction deviates from the ridge regression empirical risk minimizer (Ridge ERM): when the test data reside in the subspace corresponding to the small eigenvalues of the empirical correlation matrix of the training data, the prediction is shifted toward 0. Our LpNML reduces the Ridge ERM error by up to 20% on the PMLB datasets, and is up to 4.9% more robust in the presence of distribution shift than recent leading methods on UCI datasets.
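The two ingredients named above, the Ridge ERM and the small-eigenvalue subspace of the empirical correlation matrix, can be illustrated with a minimal sketch. It does not implement the LpNML correction itself, whose formula is not given in this abstract. The function names `ridge_erm` and `small_eig_energy`, the synthetic Gaussian data, and the choice `lam=1.0` are all assumptions made for illustration.

```python
import numpy as np


def ridge_erm(X, y, lam):
    """Ridge ERM: argmin_theta ||y - X theta||^2 + lam * ||theta||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def small_eig_energy(X, x_test, k):
    """Fraction of x_test's squared norm that lies in the span of the k
    smallest-eigenvalue eigenvectors of the empirical correlation matrix."""
    corr = X.T @ X / X.shape[0]
    _, eigvecs = np.linalg.eigh(corr)  # eigenvalues returned in ascending order
    proj = eigvecs[:, :k].T @ x_test
    return float(proj @ proj / (x_test @ x_test))


# Synthetic over-parameterized problem (hypothetical data): more features than samples.
rng = np.random.default_rng(0)
n, d = 20, 50
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.1 * rng.normal(size=n)

theta_hat = ridge_erm(X, y, lam=1.0)

# A random test point has a sizeable component in the d - n directions that
# the training data never excite (eigenvalues numerically zero).
x_test = rng.normal(size=d)
y_pred = float(x_test @ theta_hat)
energy = small_eig_energy(X, x_test, k=d - n)

# A training point, by contrast, has essentially no energy in that subspace.
energy_train = small_eig_energy(X, X[0], k=d - n)
```

According to the abstract, test points with a large value of `energy` are the ones for which the LpNML prediction is shifted toward 0 relative to `y_pred`.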