Bilevel optimization has recently received tremendous attention due to its great success in important machine learning problems such as meta-learning, reinforcement learning, and hyperparameter optimization. Extending single-agent training of bilevel problems to the decentralized setting is a natural generalization, and there has been a flurry of work studying decentralized bilevel optimization algorithms. However, it remains unknown how to design a distributed algorithm whose sample complexity and convergence rate are comparable to those of SGD for stochastic optimization, while at the same time avoiding direct computation of the exact Hessian or Jacobian matrices. In this paper we propose such an algorithm. More specifically, we propose a novel decentralized stochastic bilevel optimization (DSBO) algorithm that only requires first-order stochastic oracles together with Hessian-vector and Jacobian-vector product oracles. The sample complexity of our algorithm matches the best known results for DSBO, and the advantage of our algorithm is that it does not require estimating the full Hessian and Jacobian matrices, thereby yielding improved per-iteration complexity.
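The structural ingredient highlighted above, replacing full second-order matrices with matrix-vector product oracles, can be realized via automatic differentiation. Below is a minimal sketch (not the paper's DSBO algorithm), assuming JAX; the lower-level objective `g` and the names `x`, `y`, `v` are illustrative placeholders.

```python
# Sketch: matrix-free Hessian-vector and Jacobian-vector product oracles
# for a lower-level objective g(x, y), assuming JAX; illustrative only.
import jax
import jax.numpy as jnp

def g(x, y):
    # Placeholder smooth lower-level objective g(x, y).
    return jnp.sum((y - jnp.sin(x)) ** 2) + 0.1 * jnp.dot(x, y)

def hessian_vector_product(x, y, v):
    # Directional derivative of grad_y g(x, y) along v in y:
    # returns nabla^2_{yy} g(x, y) v at the cost of one extra gradient pass,
    # without ever forming the d x d Hessian.
    grad_y = lambda y_: jax.grad(g, argnums=1)(x, y_)
    return jax.jvp(grad_y, (y,), (v,))[1]

def jacobian_vector_product(x, y, v):
    # Mixed second-derivative (Jacobian) product contracting grad_y g(x, y)
    # against v through its dependence on x, again matrix-free.
    grad_y = lambda x_: jax.grad(g, argnums=1)(x_, y)
    return jax.vjp(grad_y, x)[1](v)[0]

x = jnp.ones(5)
y = jnp.zeros(5)
v = jnp.arange(5.0)
print(hessian_vector_product(x, y, v))
print(jacobian_vector_product(x, y, v))
```

Each oracle call costs a small constant multiple of a gradient evaluation, which is the source of the per-iteration savings relative to estimating the full Hessian and Jacobian matrices.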