Many machine learning problems encode their data as a matrix with a possibly very large number of rows and columns. In several applications like neuroscience, image compression or deep reinforcement learning, the principal subspace of such a matrix provides a useful, low-dimensional representation of individual data. Here, we are interested in determining the $d$-dimensional principal subspace of a given matrix from sample entries, i.e., from small random submatrices. Although a number of sample-based methods exist for this problem (e.g. Oja's rule \citep{oja1982simplified}), these assume access to full columns of the matrix or particular matrix structure such as symmetry, and cannot be combined as-is with neural networks \citep{baldi1989neural}. In this paper, we derive an algorithm that learns a principal subspace from sample entries, can be applied when the approximate subspace is represented by a neural network, and hence can be scaled to datasets with an effectively infinite number of rows and columns. Our method consists in defining a loss function whose minimizer is the desired principal subspace, and constructing a gradient estimate of this loss whose bias can be controlled. We complement our theoretical analysis with a series of experiments on synthetic matrices, the MNIST dataset \citep{lecun2010mnist} and the reinforcement learning domain PuddleWorld \citep{sutton1995generalization}, demonstrating the usefulness of our approach.
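As a point of reference for the sample-based setting, the following minimal NumPy sketch illustrates the classical column-based baseline cited above (an Oja-style update with QR re-orthonormalization) on a synthetic matrix. The dimensions, learning rate and planted subspace are illustrative assumptions, not the paper's experimental setup; note that each update consumes a full column of the matrix, which is precisely the requirement that our entry-based gradient estimator removes.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Synthetic matrix with a planted d-dimensional principal subspace
# (illustrative sizes; not the paper's experimental setup).
m, n, d = 50, 200, 3
U_true = np.linalg.qr(rng.standard_normal((m, d)))[0]
M = U_true @ rng.standard_normal((d, n)) + 0.05 * rng.standard_normal((m, n))

# Oja-style baseline: each step observes one full column of M, giving an
# unbiased estimate of (1/n) * M @ M.T; a QR step keeps the iterate
# orthonormal.
Phi = np.linalg.qr(rng.standard_normal((m, d)))[0]
lr = 0.1
for t in range(5000):
    j = rng.integers(n)              # sample a column index uniformly
    col = M[:, j:j + 1]              # the baseline needs the full column
    Phi = Phi + lr * col @ (col.T @ Phi)
    Phi, _ = np.linalg.qr(Phi)       # retract to orthonormal columns

# Projection distance to the true top-d left singular subspace of M.
U_top = np.linalg.svd(M, full_matrices=False)[0][:, :d]
err = np.linalg.norm(U_top @ U_top.T - Phi @ Phi.T, ord="fro")
print(f"projection distance to principal subspace: {err:.3f}")
\end{verbatim}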