Optimization problems on the generalized Stiefel manifold (and products of it) are prevalent across science and engineering. In computational science they arise in the symmetric (generalized) eigenvalue problem, in nonlinear eigenvalue problems, and in electronic structure computations, to name a few. In statistics and machine learning they arise in various dimensionality reduction techniques, such as canonical correlation analysis. In deep learning, regularization and improved stability can be obtained by constraining some layers to have parameter matrices that belong to the Stiefel manifold. Problems on the generalized Stiefel manifold can be approached via the tools of Riemannian optimization. However, using the standard geometric components for the generalized Stiefel manifold has two possible shortcomings: computing some of the geometric components can be too expensive, and convergence can be rather slow in certain cases. Both shortcomings can be addressed using a technique called Riemannian preconditioning, which amounts to using geometric components derived from a preconditioner that defines a Riemannian metric on the constraint manifold. In this paper we develop the geometric components required to perform Riemannian optimization on the generalized Stiefel manifold equipped with a non-standard metric, and illustrate theoretically and numerically the use of those components and the effect of Riemannian preconditioning for solving optimization problems on the generalized Stiefel manifold.