Reputation-driven Decision-making in Networks of Stochastic Agents

This paper studies multi-agent systems that involve networks of self-interested agents. We propose a framework derived from Markov Decision Processes, called RepNet-MDP, tailored to domains in which agent reputation is a key driver of the interactions between agents. The framework builds on the principles of RepNet-POMDP, developed by Rens et al. in 2018, but addresses its mathematical inconsistencies and alleviates its intractability by considering only fully observable environments. We further use an online learning algorithm to find approximate solutions to RepNet-MDPs. In a series of experiments, RepNet agents are shown to adapt their own behavior to the past behavior and reliability of the other agents in the network. Finally, our work identifies a limitation of the framework in its current formulation that prevents its agents from learning in circumstances in which they are not a primary actor.
