Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e.g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks. However, to cope with the limited computation and communication capabilities of edge networks, a major challenge in developing decentralized bilevel optimization techniques is to lower their sample and communication complexities. This motivates us to develop a new decentralized bilevel optimization algorithm called DIAMOND (decentralized single-timescale stochastic approximation with momentum and gradient-tracking). The contributions of this paper are as follows: i) our DIAMOND algorithm adopts a single-loop structure rather than the natural double-loop structure of bilevel optimization, which offers low computation and implementation complexity; ii) compared to existing approaches, the DIAMOND algorithm does not require any full gradient evaluations, which further reduces both sample and computational complexities; iii) through a careful integration of momentum information and gradient tracking techniques, we show that the DIAMOND algorithm achieves $\mathcal{O}(\epsilon^{-3/2})$ sample and communication complexities for reaching an $\epsilon$-stationary solution, both of which are independent of the dataset sizes and significantly outperform existing works. Extensive experiments also verify our theoretical findings.
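To make the single-loop, single-timescale structure concrete, the following is a minimal sketch (not the authors' exact algorithm) of a decentralized bilevel update that combines one stochastic lower-level step per round, a momentum-averaged hypergradient estimate, and gradient tracking over a mixing matrix. The quadratic problem data, the step sizes, and names such as `hypergrad` and `lower_grad` are illustrative assumptions; DIAMOND's precise hypergradient estimator, momentum correction, and step-size schedule differ.

```python
# Sketch of a decentralized single-loop bilevel update with momentum and
# gradient tracking on a synthetic quadratic problem (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, T = 5, 4, 2000
alpha, beta, eta, noise = 0.05, 0.1, 0.1, 0.01  # step sizes, momentum weight, noise level

# Doubly stochastic mixing matrix for a ring topology.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

# Local data: lower level g_i(x, y) = 0.5*||y - A_i x||^2 (so y_i^*(x) = A_i x),
# upper level  f_i(x, y) = 0.5*||y - b_i||^2.
A = rng.normal(size=(n_agents, dim, dim))
b = rng.normal(size=(n_agents, dim))

def hypergrad(i, x, y):
    """Stochastic hypergradient estimate of F_i(x) = f_i(x, y_i^*(x)),
    using the current y in place of y_i^*(x) (single-timescale approximation).
    Here dy_i^*/dx = A_i and grad_y f_i = y - b_i, so grad F_i = A_i^T (y - b_i)."""
    return A[i].T @ (y - b[i]) + noise * rng.normal(size=dim)

def lower_grad(i, x, y):
    """Stochastic gradient of g_i with respect to y."""
    return (y - A[i] @ x) + noise * rng.normal(size=dim)

x = np.zeros((n_agents, dim))  # upper-level variables, one row per agent
y = np.zeros((n_agents, dim))  # lower-level variables
d = np.array([hypergrad(i, x[i], y[i]) for i in range(n_agents)])  # momentum estimates
s = d.copy()                   # gradient trackers

for t in range(T):
    # One lower-level stochastic gradient step per round (single-loop structure).
    y = W @ y - eta * np.array([lower_grad(i, x[i], y[i]) for i in range(n_agents)])
    # Momentum-averaged stochastic hypergradients (no full-batch evaluations).
    g_new = np.array([hypergrad(i, x[i], y[i]) for i in range(n_agents)])
    d_new = (1 - beta) * d + beta * g_new
    # Gradient tracking: trackers follow the network-average momentum estimate.
    s = W @ s + d_new - d
    d = d_new
    # Consensus step plus descent along the tracked direction.
    x = W @ x - alpha * s

print("consensus error:", np.linalg.norm(x - x.mean(axis=0)))
```

The sketch uses a simple exponential momentum average; a recursive (variance-reduced) momentum estimator can be substituted at the same place in the loop, which is what allows the improved complexity bounds without full gradient evaluations.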