In this paper we analyze, for a linear regression model with Gaussian covariates, the performance of a Bayesian estimator given by the mean of a log-concave posterior distribution with a Gaussian prior, in the high-dimensional limit where the number of samples and the dimension of the covariates are large and proportional. Although Bayesian estimators have previously been analyzed in high dimensions for Bayes-optimal linear regression, where the correct posterior is used for inference, much less is known when there is a mismatch. Here we consider a model in which the responses are corrupted by Gaussian noise and are known to be generated as linear combinations of the covariates, but the distributions of the ground-truth regression coefficients and of the noise are unknown. This regression task can be rephrased as a statistical mechanics model known as the Gardner spin glass, an analogy which we exploit. Using a leave-one-out approach, we characterize the mean-square error for the regression coefficients. We also derive the log-normalizing constant of the posterior. Similar models have been studied by Shcherbina and Tirozzi and by Talagrand, but our arguments are much more straightforward. An interesting consequence of our analysis is that in the quadratic loss case, the performance of the Bayesian estimator is independent of a global "temperature" hyperparameter and matches the ridge estimator: sampling and optimizing are equally good.
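The quadratic-loss statement can be checked numerically at finite size. The sketch below is only an illustration, not the paper's derivation. It assumes that the global temperature enters as an inverse temperature `beta` multiplying both the quadratic loss and the Gaussian prior term, so that the posterior is proportional to exp(-beta * (||y - Xw||^2 / 2 + lam * ||w||^2 / 2)). The names `beta`, `lam`, `posterior_mean` and `ridge_estimator`, the uniform ground truth, and the noise level are all hypothetical choices made for this example. Under that assumption the posterior is Gaussian, and its mean coincides with its mode, the ridge estimator, for every `beta`.

```python
import numpy as np

def ridge_estimator(X, y, lam):
    """Minimizer of ||y - Xw||^2 / 2 + lam * ||w||^2 / 2 (the optimizer)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def posterior_mean(X, y, lam, beta):
    """Mean of the Gaussian posterior at inverse temperature beta (the sampler's target).

    Assumption: beta multiplies both the loss and the prior term, so the
    posterior has precision beta * (X^T X + lam * I) and linear term beta * X^T y.
    """
    d = X.shape[1]
    precision = beta * (X.T @ X + lam * np.eye(d))
    linear_term = beta * (X.T @ y)
    return np.linalg.solve(precision, linear_term)

rng = np.random.default_rng(0)
n, d, lam = 200, 100, 1.0                      # samples and dimension, proportional
X = rng.standard_normal((n, d)) / np.sqrt(d)   # Gaussian covariates
w_true = rng.uniform(-1.0, 1.0, size=d)        # non-Gaussian ground truth: prior mismatch
y = X @ w_true + 0.5 * rng.standard_normal(n)  # Gaussian noise, level unknown to the estimator

w_ridge = ridge_estimator(X, y, lam)

# Largest coordinate-wise gap between the posterior mean and ridge, per temperature.
gaps = {
    beta: float(np.max(np.abs(posterior_mean(X, y, lam, beta) - w_ridge)))
    for beta in (0.1, 1.0, 10.0)
}
print(gaps)
```

The gaps should sit at the level of floating-point round-off for every `beta`, because `beta` cancels between the precision matrix and the linear term. If the temperature were instead applied to the loss alone, the posterior mean would be a ridge estimator with an effective penalty `lam / beta`, so the independence would not hold in this simple form.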