Over the last 10 to 15 years, active inference has helped to explain various brain mechanisms, from habit formation to dopaminergic discharge, and has even been used to model curiosity. However, current implementations suffer from exponential space and time complexity when computing the prior over all possible policies up to the time horizon. Fountas et al. (2020) used Monte Carlo tree search to address this problem, leading to impressive results in two different tasks. In this paper, we present an alternative framework that aims to unify tree search and active inference by casting planning as a structure learning problem. We then present two tree search algorithms. The first propagates the expected free energy forward in time (i.e., towards the leaves), while the second propagates it backward (i.e., towards the root). Finally, we demonstrate that forward and backward propagation relate to active inference and sophisticated inference, respectively, thereby clarifying the differences between these two planning strategies.
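
As a rough illustration of why exhaustive policy enumeration becomes intractable, the minimal sketch below counts the action sequences (policies) an agent would have to consider as the time horizon grows. It is not part of the framework presented in the paper: the function name `enumerate_policies` and the choice of 4 actions are assumptions made purely for illustration.

```python
from itertools import product


def enumerate_policies(num_actions: int, horizon: int):
    """Return every action sequence (policy) of length `horizon`.

    Illustrative helper only: a policy is represented as a tuple of
    integer action indices, one per time step.
    """
    return list(product(range(num_actions), repeat=horizon))


if __name__ == "__main__":
    # With 4 actions (an arbitrary, assumed value), the number of
    # policies is 4 ** horizon, so it grows exponentially with the horizon.
    for horizon in range(1, 6):
        policies = enumerate_policies(num_actions=4, horizon=horizon)
        print(f"horizon={horizon}: {len(policies)} policies")
```

A prior defined over all of these sequences needs one entry per policy, which is why both memory and computation scale exponentially with the horizon, and why a tree search that expands only part of this space is attractive.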