We propose combining active inference with behavior trees (BTs) for reactive action planning and execution in dynamic environments, showing how robotic tasks can be formulated as a free-energy minimization problem. The proposed approach handles partially observable initial states and makes classical BTs more robust to unexpected contingencies, while also reducing the number of nodes in a tree. In this work, we specify the nominal behavior offline, through BTs. In contrast to previous approaches, however, we introduce a new type of leaf node that specifies the desired state to be achieved rather than an action to execute. The decision of which action to execute to reach the desired state is made online through active inference. This results in continual online planning and hierarchical deliberation. An agent can thus follow a predefined offline plan while retaining the ability to adapt locally and take autonomous decisions at runtime, respecting safety constraints. We provide a proof of convergence and a robustness analysis, and we validate our method on two different mobile manipulators performing similar tasks, in both a simulated and a real retail environment. The results show improved runtime adaptability using only a fraction of the hand-coded nodes that classical BTs require.
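The idea of a leaf node that holds a desired state, with the action chosen online, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The three states, the action names, the deterministic transition tables, and the graded preference distribution are all hypothetical. Action selection uses only a one-step KL "risk" term, a simplification of the expected free energy used in active inference. The predicted belief also stands in for the belief update that a real agent would compute from observations.

```python
import math
from enum import Enum


class Status(Enum):
    SUCCESS = 1
    RUNNING = 2
    FAILURE = 3


def predict(belief, transition):
    """Propagate a categorical belief; transition[i][j] = P(next=i | current=j)."""
    n = len(belief)
    return [sum(transition[i][j] * belief[j] for j in range(n)) for i in range(n)]


def risk(predicted, preferred, eps=1e-12):
    """KL divergence from the predicted to the preferred state distribution."""
    return sum(p * (math.log(p + eps) - math.log(q + eps))
               for p, q in zip(predicted, preferred))


class DesiredStateLeaf:
    """Hypothetical BT leaf that stores a desired state instead of an action."""

    def __init__(self, preferred, actions, threshold=0.9):
        self.preferred = preferred
        self.desired = max(range(len(preferred)), key=preferred.__getitem__)
        self.actions = actions
        self.threshold = threshold

    def tick(self, belief):
        # Succeed once the agent is confident the desired state holds.
        if belief[self.desired] >= self.threshold:
            return Status.SUCCESS, None, belief
        # Otherwise pick the action whose predicted outcome is closest
        # to the preferences (simplified stand-in for free-energy minimization).
        name = min(self.actions,
                   key=lambda a: risk(predict(belief, self.actions[a]), self.preferred))
        return Status.RUNNING, name, predict(belief, self.actions[name])


# Assumed toy model. States: 0 = far from object, 1 = at object, 2 = holding object.
ACTIONS = {
    "move_to": [[0, 0, 0], [1, 1, 0], [0, 0, 1]],
    "pick":    [[1, 0, 0], [0, 0, 0], [0, 1, 1]],  # no effect unless at the object
    "idle":    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
}
PREFERRED = [0.01, 0.09, 0.90]  # graded preference toward "holding object"


def run(leaf, belief, max_ticks=10):
    """Tick the leaf until it succeeds; return the status and the actions taken."""
    executed = []
    for _ in range(max_ticks):
        status, action, belief = leaf.tick(belief)
        if status is Status.SUCCESS:
            return status, executed
        executed.append(action)
    return Status.FAILURE, executed


status, executed = run(DesiredStateLeaf(PREFERRED, ACTIONS), [1.0, 0.0, 0.0])
print(status.name, executed)
```

Starting far from the object, the leaf first selects `move_to` and then `pick`, although neither action is written into the tree. The graded preferences are what make a one-step lookahead sufficient in this toy model. With a preference only on the final state, all three actions would tie on the first tick, so a longer horizon or explicit precondition handling would be needed.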