How to behave efficiently and flexibly is a central problem for understanding biological agents and creating intelligent embodied AI. It is well known that behavior can be classified into two types: reward-maximizing habitual behavior, which is fast but inflexible, and goal-directed behavior, which is flexible but slow. Conventionally, habitual and goal-directed behaviors are considered to be handled by two distinct systems in the brain. Here, we propose to bridge the gap between the two behaviors, drawing on the principles of variational Bayesian theory. We incorporate both behaviors in one framework by introducing a Bayesian latent variable called "intention". Habitual behavior is generated from the prior distribution of intention, which is goal-less, and goal-directed behavior is generated from the posterior distribution of intention, which is conditioned on the goal. Building on this idea, we present a novel Bayesian framework for modeling behaviors. The proposed framework enables skill sharing between the two kinds of behavior and, by leveraging the idea of predictive coding, enables an agent to generalize seamlessly from habitual to goal-directed behavior without additional training. The framework suggests a fresh perspective for cognitive science and embodied AI, highlighting the potential for greater integration between habitual and goal-directed behaviors.
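The prior/posterior distinction described in the abstract can be sketched in symbols. The notation below is an assumption introduced only for illustration and does not appear in the abstract: $z$ is the intention, $s$ the agent's current observation, $g$ the goal, $a$ the action, and $\pi$ a policy shared by both modes. The last line is a generic variational regularizer of the kind commonly used to tie a posterior to a prior, not necessarily the authors' objective.

```latex
% Habitual behavior: intention drawn from the goal-less prior
z \sim p(z \mid s), \qquad a \sim \pi(a \mid s, z)

% Goal-directed behavior: intention drawn from the goal-conditioned posterior
z \sim q(z \mid s, g), \qquad a \sim \pi(a \mid s, z)

% A typical variational coupling between the two (illustrative assumption)
D_{\mathrm{KL}}\!\left[\, q(z \mid s, g) \,\|\, p(z \mid s) \,\right]
```

Under these assumptions, both modes act through the same policy $\pi(a \mid s, z)$ and differ only in how $z$ is obtained, which is one way to read the abstract's claim of skill sharing between habitual and goal-directed behavior.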