Subspace learning and matrix factorization problems have a great many applications in science and engineering, and efficient algorithms are critical as dataset sizes continue to grow. Many relevant problem formulations are non-convex, and in a variety of contexts it has been observed that solving the non-convex problem directly is not only efficient but also reliably accurate. We discuss convergence theory for a particular method: first-order incremental gradient descent constrained to the Grassmannian. The output of the algorithm is an orthonormal basis for a $d$-dimensional subspace spanned by an input streaming data matrix. We study two sampling cases: where each data vector of the streaming matrix is fully sampled, and where it is undersampled by a sampling matrix $A_t\in \mathbb{R}^{m\times n}$ with $m\ll n$. Our results cover two cases, where $A_t$ is either Gaussian or a subset of rows of the identity matrix. We propose an adaptive stepsize scheme that depends only on the sampled data and the algorithm's outputs. We prove that with fully sampled data, this stepsize scheme maximizes the improvement of our convergence metric at each iteration, and that the method converges from any random initialization to the true subspace, despite the non-convex formulation and orthogonality constraints. For the case of undersampled data, we establish, with high probability, monotonic expected improvement in the defined convergence metric at each iteration.
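To make the setting concrete, the following is a minimal numpy sketch, not taken from the paper itself, of one incremental gradient step on the Grassmannian in the fully sampled case, in the style of GROUSE. The greedy angle $\theta_t = \arctan(\|r_t\|/\|p_t\|)$ stands in for the adaptive stepsize described above, and all names (grouse_step, U_hat, and so on) are illustrative, not the paper's notation.

    import numpy as np

    def grouse_step(U, v, eps=1e-12):
        # One GROUSE-style incremental gradient step on the Grassmannian,
        # for a fully sampled data vector v and an orthonormal basis U (n x d).
        w = U.T @ v                      # projection weights of v onto span(U)
        p = U @ w                        # projection of v onto the current subspace
        r = v - p                        # residual, orthogonal to span(U)
        w_nrm, p_nrm, r_nrm = (np.linalg.norm(z) for z in (w, p, r))
        if w_nrm < eps or r_nrm < eps:   # v is (numerically) already in span(U)
            return U
        # Greedy/adaptive angle theta = arctan(||r|| / ||p||): rotates the basis
        # so that the updated subspace contains v exactly.
        theta = np.arctan2(r_nrm, p_nrm)
        step = (np.cos(theta) - 1.0) * (p / p_nrm) + np.sin(theta) * (r / r_nrm)
        return U + np.outer(step, w / w_nrm)   # rank-one rotation of the basis

    # Tiny demo on synthetic streaming data drawn from a planted subspace.
    rng = np.random.default_rng(0)
    n, d, T = 50, 3, 2000
    U_true = np.linalg.qr(rng.standard_normal((n, d)))[0]   # planted subspace
    U_hat = np.linalg.qr(rng.standard_normal((n, d)))[0]    # random initialization
    for _ in range(T):
        U_hat = grouse_step(U_hat, U_true @ rng.standard_normal(d))
    # One standard subspace error, d - ||U_true^T U_hat||_F^2, which is zero
    # exactly when the two subspaces coincide (not necessarily the paper's metric).
    print(d - np.linalg.norm(U_true.T @ U_hat, "fro") ** 2)

The rank-one update preserves the orthonormality of $U$ exactly, so no re-orthogonalization step is needed. In the undersampled case, the weights $w$ would instead be fit by least squares against the sampled measurements $A_t v_t$, with the residual formed in the compressed domain.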