This work considers the optimization of nested compositions of functions over Riemannian manifolds, where each function in the composition contains an expectation. This class of problems is gaining popularity in applications such as policy evaluation in reinforcement learning and model customization in meta-learning. Standard Riemannian stochastic gradient methods for non-compositional optimization cannot be applied directly, since stochastic approximation of the inner functions creates bias in the gradients of the outer functions. For two-level compositional optimization, we present a Riemannian Stochastic Composition Gradient Descent (R-SCGD) method that finds an approximate stationary point, with expected squared norm of the Riemannian gradient smaller than $\epsilon$, in $O(\epsilon^{-2})$ calls to the stochastic gradient oracle of the outer function and the stochastic function and gradient oracles of the inner function. Furthermore, we generalize the R-SCGD algorithm to problems with multi-level nested compositional structures, with the same complexity of $O(\epsilon^{-2})$ for the first-order stochastic oracle. Finally, the performance of the R-SCGD method is numerically evaluated on a policy evaluation problem in reinforcement learning.
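For reference, a minimal sketch of the two-level problem form described above, using the notation standard in the stochastic compositional optimization literature (the symbols $\mathcal{M}$, $f_v$, and $g_w$ are illustrative and not taken from the paper body):
\[
  \min_{x \in \mathcal{M}} \; F(x) \;=\; f\bigl(g(x)\bigr) \;=\; \mathbb{E}_{v}\!\left[ f_v\!\left( \mathbb{E}_{w}\!\left[ g_w(x) \right] \right) \right],
\]
where $\mathcal{M}$ is a Riemannian manifold, $g(x) = \mathbb{E}_{w}[g_w(x)]$ is the inner expectation, and $f$ is the outer expectation. Because only a stochastic estimate of $g(x)$ is available inside $f$, a naive stochastic Riemannian gradient of the composition is biased, which is the difficulty the R-SCGD method is designed to address.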