解决基于图形的公共公益运动,同时进行树搜索和模拟学习 (Solving Graph-based Public Good Games with Tree Search and Imitation Learning)

Public goods games represent insightful settings for studying incentives for individual agents to make contributions that, while costly for each of them, benefit the wider society. In this work, we adopt the perspective of a central planner with a global view of a network of self-interested agents and the goal of maximizing some desired property in the context of a best-shot public goods game. Existing algorithms for this known NP-complete problem find solutions that are sub-optimal and cannot optimize for criteria other than social welfare. In order to efficiently solve public goods games, our proposed method directly exploits the correspondence between equilibria and the Maximal Independent Set (mIS) structural property of graphs. In particular, we define a Markov Decision Process which incrementally generates an mIS, and adopt a planning method to search for equilibria, outperforming existing methods. Furthermore, we devise a graph imitation learning technique that uses demonstrations of the search to obtain a graph neural network parametrized policy which quickly generalizes to unseen game instances. Our evaluation results show that this policy is able to reach 99.5% of the performance of the planning method while being three orders of magnitude faster to evaluate on the largest graphs tested. The methods presented in this work can be applied to a large class of public goods games of potentially high societal impact and more broadly to other graph combinatorial optimization problems.

翻译：公共产品游戏代表了研究激励个体代理商做出贡献的有见地的环境,这些激励对于每个代理商来说都是代价高昂的,但却有利于更广大的社会。在这项工作中,我们采纳了一个中心规划商的观点,从全球的角度看待一个自我利益代理商网络,并着眼于在最佳的公益游戏中最大限度地扩大某些理想财产。这个已知的NP不完整问题的现有算法找到了亚于最佳的解决方案,并且无法优化社会福利以外的标准。为了有效解决公益游戏,我们提出的方法直接利用了公平与最大独立游戏(MS)图表结构属性之间的对等。特别是,我们定义了一个具有全球视角的中央规划师的观点,以逐步生成一个自我利益代理商网络的网络,并制定了在最佳公益游戏中寻找某种理想财产的计划方法。此外,我们设计了一个图形模拟学习技术,利用搜索的演示来获得一个图形神经网络的分化政策,迅速概括到看不见的游戏实例。我们的评估结果表明,这一政策能够达到规划方法绩效的99.5%。我们定义了马尔科夫决策过程过程,逐渐生成一个MS,同时将快速地测试一个巨大的社会图像,可以快速地评估其他水平。