Planning is an extraordinary ability in which the brain imagines possible futures, evaluates them, and then enacts one. Using traditional planning models, computer scientists have attempted to replicate this capacity with some success, but they ultimately face a recurring limitation: as the number of steps in a plan grows, the number of possible futures makes it intractable to determine the right sequence of actions to reach a goal state. Based on prior theoretical work on how the ecology of an animal governs the value of spatial planning, we developed a more efficient, biologically inspired planning algorithm, TLPPO. This algorithm allows us to achieve mouse-level predator evasion performance with orders of magnitude less computation than a widely used algorithm for planning under the partial observability that typifies predator-prey interactions. We compared the performance of a real-time agent using TLPPO against that of live mice, all tasked with evading a robot predator. We anticipate that these results will be helpful to users and developers of planning algorithms, as well as to areas of neuroscience where robot-animal interaction can provide a useful approach to studying the basis of complex behaviors.