Helmmholtz 连续运动会分解成像上的牛顿优化 (Newton Optimization on Helmholtz Decomposition for Continuous Games)

Many learning problems involve multiple agents optimizing different interactive functions. In these problems, the standard policy gradient algorithms fail due to the non-stationarity of the setting and the different interests of each agent. In fact, algorithms must take into account the complex dynamics of these systems to guarantee rapid convergence towards a (local) Nash equilibrium. In this paper, we propose NOHD (Newton Optimization on Helmholtz Decomposition), a Newton-like algorithm for multi-agent learning problems based on the decomposition of the dynamics of the system in its irrotational (Potential) and solenoidal (Hamiltonian) component. This method ensures quadratic convergence in purely irrotational systems and pure solenoidal systems. Furthermore, we show that NOHD is attracted to stable fixed points in general multi-agent systems and repelled by strict saddle ones. Finally, we empirically compare the NOHD's performance with that of state-of-the-art algorithms on some bimatrix games and in a continuous Gridworld environment.

翻译：许多学习问题涉及多个代理商优化不同互动功能。在这些问题中,标准的政策梯度算法由于每个代理商的设置和不同利益不固定而未能实现。事实上,算法必须考虑到这些系统的复杂动态,以保证迅速接近(本地)纳什均衡。在本文中,我们建议NOHD(Helmholtz Decomposition 上的Newton优化),一种像牛顿式的多试剂学习问题算法,这种算法基于系统动力分解在其渗透(Potential)和单向(Hamiltonian)部分中的分解。这种方法确保了纯循环系统和纯单向单向系统中的二次趋同。此外,我们表明NOHD被吸引到一般多剂系统中稳定的固定点,并被严格的马鞍击退。最后,我们从经验上将NOHD的性能与一些双马座游戏和连续的格德世界环境中的状态算法的性比较。

相关内容

Continuity

关注 4

让 iOS 8 和 OS X Yosemite 无缝切换的一个新特性。 > Apple products have always been designed to work together beautifully. But now they may really surprise you. With iOS 8 and OS X Yosemite, you’ll be able to do more wonderful things than ever before.

Source: Apple - iOS 8

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

《算法凸几何》简明书，Algorithmic Convex Geometry，50页pdf

专知会员服务

42+阅读 · 2021年4月2日

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日