Enabling autonomous agents to act cooperatively is an important step toward integrating artificial intelligence into our daily lives. While some methods seek to stimulate cooperation by letting agents give rewards to others, in this paper we propose a method inspired by the stock market, in which agents can participate in other agents' returns by acquiring reward shares. Intuitively, an agent may learn to act in the common interest when it is directly affected by the other agents' rewards. Empirical results on the tested general-sum Markov games show that this mechanism promotes cooperative policies among independently trained agents in social dilemma situations. Moreover, as demonstrated in a temporally and spatially extended domain, participation can lead to the development of roles and the division of subtasks among the agents.
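The intuition that holding shares in another agent's reward can change an agent's incentives can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the share matrix `shares` (entry `shares[i, j]` is the fraction of agent j's reward paid out to agent i), the helper names `redistribute` and `best_response`, and the prisoner's dilemma payoffs in `PAYOFFS`. The mechanism by which agents actually acquire shares is not modeled here; the shares are simply fixed.

```python
import numpy as np

# Hypothetical prisoner's dilemma payoffs, indexed by (action of agent 0, action of agent 1).
# Actions: 0 = cooperate, 1 = defect. These numbers are illustrative only.
PAYOFFS = {
    (0, 0): (3.0, 3.0),
    (0, 1): (0.0, 5.0),
    (1, 0): (5.0, 0.0),
    (1, 1): (1.0, 1.0),
}


def redistribute(rewards, shares):
    """Return the reward each agent receives after shares are paid out.

    shares[i, j] is the fraction of agent j's reward that goes to agent i,
    so every column must sum to 1 (all reward is distributed).
    """
    rewards = np.asarray(rewards, dtype=float)
    shares = np.asarray(shares, dtype=float)
    if not np.allclose(shares.sum(axis=0), 1.0):
        raise ValueError("each column of `shares` must sum to 1")
    return shares @ rewards


def best_response(shares, agent, other_action):
    """Action (0 or 1) that maximizes `agent`'s redistributed one-shot reward."""
    def effective_reward(action):
        joint = (action, other_action) if agent == 0 else (other_action, action)
        return redistribute(PAYOFFS[joint], shares)[agent]

    return max((0, 1), key=effective_reward)


no_participation = np.eye(2)               # each agent keeps its own reward
equal_participation = np.full((2, 2), 0.5)  # each agent holds half of both rewards

for name, shares in [("none", no_participation), ("equal", equal_participation)]:
    responses = [best_response(shares, agent=0, other_action=a) for a in (0, 1)]
    print(name, responses)
```

Under these assumed payoffs, defecting is the best response for an agent that keeps only its own reward, whereas cooperating becomes the best response once both agents hold equal shares of each other's rewards. This is only a one-shot, fixed-share illustration of the intuition, not a reproduction of the learning setup or the results described in the abstract.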