In this article, we propose a backpropagation-free approach to robotic control based on the neuro-cognitive computational framework of neural generative coding (NGC), designing an agent built entirely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards and embody the principles of planning-as-inference. Concretely, we craft an adaptive agent system, which we call active predictive coding (ActPC), that balances an internally generated epistemic signal (meant to encourage intelligent exploration) with an internally generated instrumental signal (meant to encourage goal-seeking behavior) in order to learn how to control various simulated robotic systems as well as a complex robotic arm, using a realistic robotics simulator, i.e., the Surreal Robotics Suite, on the block lifting and can pick-and-place tasks. Notably, our experimental results demonstrate that the proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with, or outperforms, several powerful backprop-based RL approaches.
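As a rough illustration only (not the paper's exact formulation), the balance between the two internally generated drives can be thought of as a combined reward signal of the form
$$ r_t = r^{\text{instrumental}}_t + \lambda\, r^{\text{epistemic}}_t, $$
where $r^{\text{instrumental}}_t$ denotes the goal-seeking (extrinsic/instrumental) component, $r^{\text{epistemic}}_t$ denotes the internally generated exploration bonus, and $\lambda$ is an assumed weighting coefficient; all symbols here are illustrative placeholders rather than the authors' notation.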