通过平均实地透视镜分析重量加重量自动转换器的特征学习情况分析 (Analysis of feature learning in weight-tied autoencoders via the mean field lens)

Autoencoders are among the earliest introduced nonlinear models for unsupervised learning. Although they are widely adopted beyond research, it has been a longstanding open problem to understand mathematically the feature extraction mechanism that trained nonlinear autoencoders provide. In this work, we make progress in this problem by analyzing a class of two-layer weight-tied nonlinear autoencoders in the mean field framework. Upon a suitable scaling, in the regime of a large number of neurons, the models trained with stochastic gradient descent are shown to admit a mean field limiting dynamics. This limiting description reveals an asymptotically precise picture of feature learning by these models: their training dynamics exhibit different phases that correspond to the learning of different principal subspaces of the data, with varying degrees of nonlinear shrinkage dependent on the $\ell_{2}$-regularization and stopping time. While we prove these results under an idealized assumption of (correlated) Gaussian data, experiments on real-life data demonstrate an interesting match with the theory. The autoencoder setup of interests poses a nontrivial mathematical challenge to proving these results. In this setup, the "Lipschitz" constants of the models grow with the data dimension $d$. Consequently an adaptation of previous analyses requires a number of neurons $N$ that is at least exponential in $d$. Our main technical contribution is a new argument which proves that the required $N$ is only polynomial in $d$. We conjecture that $N\gg d$ is sufficient and that $N$ is necessarily larger than a data-dependent intrinsic dimension, a behavior that is fundamentally different from previously studied setups.

翻译：自动计算器是最早推出的非线性非线性学习模型之一。虽然它们被广泛采用,超出了研究范围, 但从数学角度理解非线性自动计算器提供的特征提取机制是一个长期的开放问题。在这项工作中, 我们通过分析一个在中字段框架中的双层加权非线性非线性自动计算器的类别来解决这个问题。在大量神经元的体系中, 受过Stochacial 梯度下降训练的模型在适当规模上显示, 承认了一个卑鄙的字段限制动态。这个限制描述显示这些模型的特征学习的准确性图景: 它们的培训动态展示了不同阶段, 与以往数据的主要子空间的学习相吻合, 不同程度的非线性缩缩缩取决于$\%2} 常规化和停止时间。虽然我们证明这些结果是在一个理想化的假设( cortical) $的高度数据值数据下进行的, 实际生活中的数据实验显示与理论的匹配性。数字的自动解读器设置更精确性地表明, 美元的主要模型中, 需要不断的解算算数据分析。

相关内容

自编码器

关注 140

自动编码器是一种人工神经网络，用于以无监督的方式学习有效的数据编码。自动编码器的目的是通过训练网络忽略信号“噪声”来学习一组数据的表示（编码），通常用于降维。与简化方面一起，学习了重构方面，在此，自动编码器尝试从简化编码中生成尽可能接近其原始输入的表示形式，从而得到其名称。基本模型存在几种变体，其目的是迫使学习的输入表示形式具有有用的属性。自动编码器可有效地解决许多应用问题，从面部识别到获取单词的语义。

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

多标签学习的新趋势（2020 Survey）

专知会员服务

44+阅读 · 2020年12月6日

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日