We study the design of autonomous agents that can deceive outside observers about their intentions while carrying out tasks in stochastic, complex environments. Modeling the agent's behavior as a Markov decision process, we consider a setting in which the agent aims to reach one of several potential goals while deceiving observers about its true goal. We propose a novel approach that models observer predictions using the principle of maximum entropy and efficiently generates deceptive strategies via linear programming. The approach lets the agent exhibit a variety of tunable deceptive behaviors while satisfying probabilistic constraints on its behavior. We evaluate the approach through comparative user studies and present a case study on the streets of Manhattan, New York, using real travel time distributions.
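To make the idea of a maximum-entropy observer concrete, the following is a minimal sketch and not the model proposed in this work. It assumes a hypothetical observer who believes the agent follows low-cost paths with Boltzmann (maximum-entropy) probabilities, so that the posterior over goals decays exponentially in the extra cost the observed partial path incurs relative to an optimal path to each goal. The function name `goal_posterior`, its parameters, and the `temperature` knob are illustrative assumptions.

```python
import math


def goal_posterior(cost_so_far, cost_to_go, optimal_cost, prior=None, temperature=1.0):
    """Hypothetical observer model (an assumption, not the paper's formulation).

    P(goal | partial path) is proportional to
        prior[goal] * exp(-(cost_so_far + cost_to_go[goal] - optimal_cost[goal]) / temperature)

    cost_so_far  -- cost of the path observed so far
    cost_to_go   -- dict: goal -> cheapest remaining cost from the current state
    optimal_cost -- dict: goal -> cheapest cost from the start state
    prior        -- dict: goal -> strictly positive prior weight (uniform if None)
    """
    goals = list(cost_to_go)
    if prior is None:
        prior = {g: 1.0 / len(goals) for g in goals}

    # Log-score per goal: log prior minus the scaled excess cost of the observed prefix.
    scores = {
        g: math.log(prior[g])
        - (cost_so_far + cost_to_go[g] - optimal_cost[g]) / temperature
        for g in goals
    }

    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores.values())
    weights = {g: math.exp(s - top) for g, s in scores.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


# Illustrative numbers: both goals are 5 units from the start; after 2 units of
# travel the agent is 3 units from goal A and 7 units from goal B.
posterior = goal_posterior(
    cost_so_far=2.0,
    cost_to_go={"A": 3.0, "B": 7.0},
    optimal_cost={"A": 5.0, "B": 5.0},
)
print(posterior)
```

Under this sketch, a deceptive agent would prefer paths that keep the posterior mass on its true goal low for as long as its constraints allow. A larger `temperature` describes an observer who tolerates more detours before committing to a prediction.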