In this paper, we study smooth stochastic multi-level composition optimization problems, where the objective function is a nested composition of $T$ functions. We assume access to noisy evaluations of the functions and their gradients through a stochastic first-order oracle. For solving this class of problems, we propose two algorithms using moving-average stochastic estimates, and analyze their convergence to an $\epsilon$-stationary point of the problem. We show that the first algorithm, which is a generalization of \cite{GhaRuswan20} to the $T$-level case, achieves a sample complexity of $\mathcal{O}(1/\epsilon^6)$ by using mini-batches of samples in each iteration. By modifying this algorithm with linearized stochastic estimates of the function values, we improve the sample complexity to $\mathcal{O}(1/\epsilon^4)$. {\color{black}This modification not only removes the requirement of a mini-batch of samples in each iteration, but also makes the algorithm parameter-free and easy to implement}. To the best of our knowledge, this is the first time that such an online algorithm, designed for the (un)constrained multi-level setting, obtains the same sample complexity as the smooth single-level setting under standard assumptions (unbiasedness and boundedness of the second moments) on the stochastic first-order oracle.
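For concreteness, the multi-level problem class described above can be written as follows (a sketch of the standard formulation; the notation $f_i$ and $X$ is assumed here, not quoted from the paper):
\[
\min_{x \in X} \; f(x) \;:=\; f_1\bigl(f_2(\cdots f_T(x)\cdots)\bigr),
\]
where each $f_i$ is smooth, and only noisy evaluations of the $f_i$ and their gradients $\nabla f_i$ are available through the stochastic first-order oracle. The single-level setting corresponds to $T = 1$, in which case the $\mathcal{O}(1/\epsilon^4)$ sample complexity is the standard benchmark that the second algorithm matches.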