Bayesian Interpolation with Deep Linear Networks

This article concerns Bayesian inference using deep linear networks with output dimension one. In the interpolating (zero noise) regime we show that, with Gaussian weight priors and MSE negative log-likelihood loss, both the predictive posterior and the Bayesian model evidence can be written in closed form in terms of a class of meromorphic special functions called Meijer-G functions. These results are non-asymptotic and hold for any training dataset, network depth, and hidden layer widths, giving exact solutions to Bayesian interpolation using a deep Gaussian process with a Euclidean covariance at each layer. Through novel asymptotic expansions of Meijer-G functions, a rich new picture of the role of depth emerges. Specifically, we find that the posteriors in deep linear networks with data-independent priors are the same as in shallow networks with evidence-maximizing data-dependent priors. In this sense, deep linear networks make provably optimal predictions. We also prove that, starting from data-agnostic priors, Bayesian model evidence in wide networks is only maximized at infinite depth. This gives a principled reason to prefer deeper networks (at least in the linear case). Finally, our results show that with data-agnostic priors a novel notion of effective depth given by \[\#\text{hidden layers}\times\frac{\#\text{training data}}{\text{network width}}\] determines the Bayesian posterior in wide linear networks, giving rigorous new scaling laws for generalization error.
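
To make the effective-depth quantity in the abstract concrete, here is a minimal Python sketch that evaluates (number of hidden layers) × (number of training data) / (network width). The function name `effective_depth` and its parameter names are hypothetical, chosen only for illustration. The sketch also assumes every hidden layer has the same width, which the abstract does not state.

```python
def effective_depth(num_hidden_layers: int, num_training_points: int, width: int) -> float:
    """Effective depth as written in the abstract:
    (#hidden layers) * (#training data) / (network width).

    Assumption (not from the source): all hidden layers share one width.
    """
    if num_hidden_layers < 0 or num_training_points < 0:
        raise ValueError("layer and data counts must be non-negative")
    if width <= 0:
        raise ValueError("width must be positive")
    return num_hidden_layers * num_training_points / width


# Doubling both depth and width leaves the effective depth unchanged.
shallow_narrow = effective_depth(num_hidden_layers=4, num_training_points=50, width=100)
deep_wide = effective_depth(num_hidden_layers=8, num_training_points=50, width=200)
print(shallow_narrow, deep_wide)
```

Under this formula, two wide networks with the same ratio of depth to width, trained on the same number of points, share one effective depth. The abstract identifies this quantity as what determines the Bayesian posterior in wide linear networks.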

