Convex nested stochastic composite optimization (NSCO) has recently received considerable attention for its applications in reinforcement learning and risk-averse optimization. Existing NSCO algorithms have stochastic oracle complexities that are worse, by orders of magnitude, than those for simpler stochastic composite optimization problems without the nested structure (e.g., the sum of a smooth and a non-smooth function). Moreover, they require all outer-layer functions to be smooth, a condition that some important applications do not satisfy. These discrepancies prompt us to ask: ``does the nested composition make stochastic optimization more difficult in terms of the order of oracle complexity?'' In this paper, we answer the question by developing order-optimal algorithms for the convex NSCO problem constructed from an arbitrary composition of smooth, structured non-smooth, and general non-smooth layer functions. When all outer-layer functions are smooth, we propose a stochastic sequential dual (SSD) method that achieves an oracle complexity of $\mathcal{O}(1/\epsilon^2)$ when the problem is non-strongly convex and $\mathcal{O}(1/\epsilon)$ when it is strongly convex. When some outer-layer function is structured non-smooth or general non-smooth, we propose a nonsmooth stochastic sequential dual (nSSD) method that achieves an oracle complexity of $\mathcal{O}(1/\epsilon^2)$. We provide a lower complexity bound showing that this latter $\mathcal{O}(1/\epsilon^2)$ complexity cannot be improved even in the strongly convex setting. All these complexity results seem to be new in the literature, and they indicate that the convex NSCO problem has the same order of oracle complexity as problems without the nested composition in every setting except the strongly convex case with a non-smooth outer-layer function.