Reward-Sharing Relational Networks in Multi-Agent Reinforcement Learning as a Framework for Emergent Behavior

From arXiv. Presented at the Adaptive and Learning Agents Workshop at AAMAS, London, UK (Virtual). Visit https://sites.google.com/view/marl-rsrn for videos and more information.

In this work, we integrate "social" interactions into the multi-agent reinforcement learning (MARL) setup through a user-defined relational network and examine the effects of agent-agent relations on the rise of emergent behaviors. Leveraging insights from sociology and neuroscience, our proposed framework models agent relationships using the notion of Reward-Sharing Relational Networks (RSRN), where network edge weights act as a measure of how much one agent is invested in the success of (or "cares about") another. We construct relational rewards as a function of the RSRN interaction weights to collectively train the multi-agent system via a MARL algorithm. The performance of the system is tested in a 3-agent scenario with different relational network structures (e.g., self-interested, communitarian, and authoritarian networks). Our results indicate that reward-sharing relational networks can significantly influence learned behaviors. We posit that RSRN can act as a framework where different relational networks produce distinct emergent behaviors, often analogous to the intuited sociological understanding of such networks.

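The abstract says relational rewards are built as a function of the RSRN edge weights, but it does not give the exact function. The sketch below assumes the simplest candidate, a linear weighted sum, purely for illustration; the paper's actual formulation may differ. The names `NETWORKS` and `relational_rewards` and all weight values are hypothetical, and the "authoritarian" matrix encodes one plausible reading (every agent cares only about agent 0).

```python
from typing import Dict, List

Matrix = List[List[float]]

# Hypothetical 3-agent relational networks. Row i lists how much agent i
# "cares about" each agent j. Weight values are illustrative assumptions,
# not taken from the paper.
NETWORKS: Dict[str, Matrix] = {
    # Each agent cares only about its own reward.
    "self_interested": [[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.0, 1.0]],
    # Every agent cares equally about every agent.
    "communitarian": [[1.0, 1.0, 1.0],
                      [1.0, 1.0, 1.0],
                      [1.0, 1.0, 1.0]],
    # Assumed reading: all agents care only about agent 0 (the "leader").
    "authoritarian": [[1.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0]],
}


def relational_rewards(weights: Matrix, rewards: List[float]) -> List[float]:
    """Return one relational reward per agent.

    Assumed linear form: relational_i = sum_j weights[i][j] * rewards[j].
    """
    n = len(rewards)
    if len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching rewards")
    return [sum(w * r for w, r in zip(row, rewards)) for row in weights]


if __name__ == "__main__":
    individual = [1.0, 2.0, 4.0]  # made-up per-agent environment rewards
    for name, weights in NETWORKS.items():
        print(name, relational_rewards(weights, individual))
```

With the made-up individual rewards `[1.0, 2.0, 4.0]`, the self-interested network leaves each agent's reward unchanged, the communitarian network gives every agent the shared total of 7.0, and the assumed authoritarian network gives every agent the leader's reward of 1.0. In training, each agent would optimize its relational reward instead of its individual one, which is how the network structure can shape the learned behavior.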