Zero-shot coordination, that is, coordinating effectively with a wide range of unseen partners, remains a significant challenge in cooperative artificial intelligence (AI). Previous algorithms have attempted to address this challenge by optimizing fixed objectives within a population to improve strategy or behavior diversity. However, these approaches can suffer from learning loss and fail to cooperate with certain strategies within the population, a problem known as cooperative incompatibility. To address this issue, we propose the Cooperative Open-ended LEarning (COLE) framework, which constructs open-ended objectives in two-player cooperative games from the perspective of graph theory in order to assess and identify the cooperative ability of each strategy. We further specify the framework and propose a practical algorithm that leverages knowledge from game theory and graph theory. Furthermore, an analysis of the algorithm's learning process shows that it can efficiently overcome cooperative incompatibility. Experimental results in the Overcooked game environment demonstrate that our method outperforms current state-of-the-art methods when coordinating with partners of different skill levels. Our code and demo are available at https://sites.google.com/view/cole-2023.