In this work, we consider the problem of minimizing the sum of Moreau envelopes of given functions, which has previously appeared in the context of meta-learning and personalized federated learning. In contrast to the existing theory that requires running subsolvers until a certain precision is reached, we only assume that a finite number of gradient steps is taken at each iteration. As a special case, our theory allows us to show the convergence of First-Order Model-Agnostic Meta-Learning (FO-MAML) to the vicinity of a solution of the Moreau objective. We also study a more general family of first-order algorithms that can be viewed as a generalization of FO-MAML. Our main theoretical contribution is an improvement upon the inexact SGD framework. In particular, our perturbed-iterate analysis allows for tighter guarantees that improve the dependency on the problem's conditioning. In contrast to related work on meta-learning, our analysis does not require any assumptions on Hessian smoothness, and can leverage the smoothness and convexity of the reformulation based on Moreau envelopes. Furthermore, to fill the gaps in the comparison of FO-MAML to Implicit MAML (iMAML), we show that the objective of iMAML is neither smooth nor convex, implying that it has no convergence guarantees based on the existing theory.
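For concreteness, the objective referred to above can be written as a finite sum of Moreau envelopes. The following is a minimal sketch of this formulation; the notation (task losses $f_i$, regularization parameter $\lambda$, number of tasks $n$) is our own and may differ from the paper's:
$$
\min_{x \in \mathbb{R}^d} F(x) = \frac{1}{n}\sum_{i=1}^{n} F_i(x),
\qquad
F_i(x) = \min_{y \in \mathbb{R}^d} \left\{ f_i(y) + \frac{1}{2\lambda}\,\|y - x\|^2 \right\},
$$
where each $F_i$ is the Moreau envelope of $f_i$, so evaluating its gradient requires (approximately) solving the inner proximal problem, which is where the finitely many gradient steps per iteration come in.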