Recent efforts to unravel the mystery of implicit regularization in deep learning have led to a theoretical focus on matrix factorization -- matrix completion via a linear neural network. As a step further towards practical deep learning, we provide the first theoretical analysis of implicit regularization in tensor factorization -- tensor completion via a certain type of non-linear neural network. We circumvent the notorious difficulty of tensor problems by adopting a dynamical systems perspective, and characterizing the evolution induced by gradient descent. The characterization suggests a form of greedy low tensor rank search, which we rigorously prove under certain conditions, and empirically demonstrate under others. Motivated by tensor rank capturing the implicit regularization of a non-linear neural network, we empirically explore it as a measure of complexity, and find that it captures the essence of datasets on which neural networks generalize. This leads us to believe that tensor rank may pave the way to explaining both implicit regularization in deep learning, and the properties of real-world data translating this implicit regularization to generalization.
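To make the setting concrete, below is a minimal NumPy sketch of tensor completion via a CP (canonical polyadic) factorization trained with gradient descent, the kind of model the abstract refers to. The tensor size, number of components, observation ratio, initialization scale, learning rate, and step count are all illustrative assumptions, not values from the paper; the near-zero initialization is the regime in which the greedy low tensor rank search is expected to appear, visible as per-component norms growing one at a time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy instance: complete a rank-1 ground-truth tensor from
# partial observations. All hyperparameters here are illustrative choices.
d, R = 10, 5                                  # tensor side length, # CP components
u, v, w = (rng.standard_normal(d) for _ in range(3))
target = np.einsum('i,j,k->ijk', u, v, w)     # rank-1 ground truth

mask = rng.random((d, d, d)) < 0.3            # ~30% of entries observed
n_obs = mask.sum()

# Small (near-zero) initialization -- the regime in which gradient descent
# is expected to implement the greedy low tensor rank search described above.
init_scale = 1e-3
A = init_scale * rng.standard_normal((d, R))
B = init_scale * rng.standard_normal((d, R))
C = init_scale * rng.standard_normal((d, R))

lr = 1.0
for step in range(10001):
    W = np.einsum('ir,jr,kr->ijk', A, B, C)   # CP reconstruction
    err = np.where(mask, W - target, 0.0)     # residual on observed entries only
    # Gradients of 0.5 * mean squared error over the observed entries.
    gA = np.einsum('ijk,jr,kr->ir', err, B, C) / n_obs
    gB = np.einsum('ijk,ir,kr->jr', err, A, C) / n_obs
    gC = np.einsum('ijk,ir,jr->kr', err, A, B) / n_obs
    A -= lr * gA
    B -= lr * gB
    C -= lr * gC
    if step % 2000 == 0:
        # Per-component norms: under small initialization, components tend to
        # activate sequentially, i.e. the learned tensor rank grows gradually.
        norms = (np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
                 * np.linalg.norm(C, axis=0))
        print(f"step {step:5d}  loss {0.5 * (err**2).sum() / n_obs:.2e}  "
              f"component norms {np.sort(norms)[::-1].round(3)}")
```

The printed component norms give a crude proxy for the effective tensor rank of the learned solution: a solution dominated by a single large component corresponds to the low tensor rank bias the abstract describes.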