The Free Energy Principle (FEP) postulates that biological agents perceive and interact with their environment in order to minimize a Variational Free Energy (VFE) with respect to a generative model of their environment. The inference of a policy (future control sequence) according to the FEP is known as Active Inference (AIF). The AIF literature describes multiple VFE objectives for policy planning that lead to epistemic (information-seeking) behavior. However, most of these objectives offer limited modeling flexibility. This paper approaches epistemic behavior from a constrained Bethe Free Energy (CBFE) perspective. Crucially, variational optimization of the CBFE can be expressed in terms of message passing on free-form generative models. The key intuition behind the CBFE is that we impose a point-mass constraint on predicted outcomes, which explicitly encodes the assumption that the agent will make observations in the future. We interpret the CBFE objective in terms of its constituent behavioral drives. We then illustrate the resulting behavior of a CBFE agent that plans in and interacts with a simulated T-maze environment. The simulations show that the CBFE agent exhibits an epistemic drive and actively plans ahead to account for the impact of predicted outcomes. Compared to an Expected Free Energy (EFE) agent, the CBFE agent attains expected reward in significantly more environmental scenarios. We conclude that CBFE optimization by message passing suggests a general mechanism for epistemic-aware AIF in free-form generative models.
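For readers unfamiliar with the VFE that the abstract builds on, the following is a minimal sketch of the standard variational free energy for a single discrete hidden state, F[q] = Σ_s q(s) (log q(s) − log p(o, s)). It only illustrates the textbook property that F is minimized by the exact posterior, where it equals the negative log evidence. It is not the CBFE objective, it does not perform message passing, and the two-state model, its probabilities, and all function names are assumptions made up for illustration.

```python
import math

def variational_free_energy(q, prior, likelihood, obs):
    """F[q] = sum_s q(s) * (log q(s) - log p(obs, s)) for a discrete state s."""
    total = 0.0
    for s, q_s in enumerate(q):
        if q_s > 0.0:  # terms with q(s) = 0 contribute nothing to the sum
            joint = prior[s] * likelihood[s][obs]  # p(obs, s) = p(s) p(obs | s)
            total += q_s * (math.log(q_s) - math.log(joint))
    return total

def exact_posterior(prior, likelihood, obs):
    """Bayes rule: returns p(s | obs) and the evidence p(obs)."""
    joint = [prior[s] * likelihood[s][obs] for s in range(len(prior))]
    evidence = sum(joint)
    return [j / evidence for j in joint], evidence

# Hypothetical two-state model; the numbers are illustrative assumptions.
prior = [0.5, 0.5]                      # p(s)
likelihood = [[0.9, 0.1], [0.2, 0.8]]   # likelihood[s][o] = p(o | s)
obs = 0                                 # index of the observed outcome

posterior, evidence = exact_posterior(prior, likelihood, obs)
f_posterior = variational_free_energy(posterior, prior, likelihood, obs)
f_prior = variational_free_energy(prior, prior, likelihood, obs)

print("F at exact posterior:", f_posterior)
print("-log evidence:       ", -math.log(evidence))
print("F at prior belief:   ", f_prior)
```

Any belief other than the exact posterior, such as the prior used here, yields a strictly larger free energy. This gap is what variational optimization reduces.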