Active inference can be defined as Bayesian modeling of the brain that yields a biologically plausible model of the agent. Its central idea rests on the free energy principle and the agent's prior preferences: an agent chooses actions that steer future observations toward its prior preferences. In this paper, we claim that active inference can be interpreted through reinforcement learning (RL) algorithms and establish a theoretical connection between them. We extend the concept of expected free energy (EFE), a core quantity in active inference, and claim that EFE can be treated as a negative value function. Motivated by the notion of prior preference and this theoretical connection, we propose a simple but novel method for learning a prior preference from experts. This illustrates that the inverse RL problem can be approached from the new perspective of active inference. Experimental results on prior preference learning demonstrate the feasibility of active inference with EFE-based rewards and its applicability to inverse RL problems.
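To make the claimed correspondence concrete, the following is a minimal sketch based on the standard per-timestep decomposition of EFE (the paper's extended definition may differ); the identification of the reward with the log prior preference, $r(o_\tau) := \ln \tilde{P}(o_\tau)$, is our assumption here:
\[
-G(\pi) \;=\; \underbrace{\sum_{\tau} \mathbb{E}_{Q(o_\tau \mid \pi)}\!\big[\ln \tilde{P}(o_\tau)\big]}_{\text{extrinsic value (expected return under } r)}
\;+\; \underbrace{\sum_{\tau} \mathbb{E}_{Q(o_\tau \mid \pi)}\!\Big[D_{\mathrm{KL}}\!\big(Q(s_\tau \mid o_\tau, \pi) \,\big\Vert\, Q(s_\tau \mid \pi)\big)\Big]}_{\text{epistemic value (information gain)}},
\]
up to the usual approximation $Q(s_\tau \mid o_\tau, \pi) \approx \tilde{P}(s_\tau \mid o_\tau)$. Read this way, $-G(\pi)$ plays the role of a value function: an expected cumulative reward plus an intrinsic exploration bonus, which is the sense in which EFE acts as a negative value function.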