General value functions (GVFs) in the reinforcement learning (RL) literature are long-term predictive summaries of the outcomes of agents following specific policies in their environment. Affordances, understood as perceived action possibilities with specific valence, can be cast as predictions of policy-relative goodness and modelled as GVFs. A systematic explication of this connection shows that GVFs, and especially their deep learning embodiments, (1) realize affordance prediction as a form of direct perception, (2) illuminate the fundamental connection between action and perception in affordance, and (3) offer a scalable way to learn affordances using RL methods. Through an extensive review of existing literature on GVF applications and representative affordance research in robotics, we demonstrate that GVFs provide the right framework for learning affordances in real-world applications. In addition, we highlight a few new avenues of research opened up by the perspective of "affordance as GVF", including using GVFs for orchestrating complex behaviors.
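To make the notion of a GVF as a "long-term predictive summary" concrete, here is a minimal toy sketch (our own illustration, not from the reviewed literature): a GVF predicts the expected discounted sum of an arbitrary cumulant signal while following a fixed policy, learned here with tabular TD(0) on a small deterministic chain. The chain, the bump-shaped cumulant, and the "always move right" policy are all invented for illustration.

```python
import numpy as np

n_states = 5   # toy chain: states 0..4, state 4 absorbing
gamma = 0.9    # continuation (discount) factor

def cumulant(s):
    # Arbitrary signal of interest; unlike a reward, a cumulant for an
    # affordance-style GVF might be e.g. "bumped into an obstacle".
    return 1.0 if s == 3 else 0.0

def step(s):
    # Deterministic "move right" policy; the last state is absorbing.
    return min(s + 1, n_states - 1)

# TD(0) learning of the GVF v(s) = E[sum_t gamma^t c(S_t) | S_0 = s, policy]
v = np.zeros(n_states)
alpha = 0.1
for episode in range(2000):
    s = 0
    for _ in range(20):
        s2 = step(s)
        c = cumulant(s2)
        # treat the absorbing state as terminating the return (continuation 0)
        g = 0.0 if s2 == n_states - 1 else gamma
        v[s] += alpha * (c + g * v[s2] - v[s])
        s = s2

print(np.round(v, 3))  # v approaches [0.81, 0.9, 1.0, 0.0, 0.0]
```

Because the cumulant and continuation function are free choices, a single agent can learn many such predictions in parallel, which is what makes GVFs a natural vehicle for affordance-like knowledge.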