Real-world applications require a robot operating in the physical world to be aware of potential risks while accomplishing its task. A large part of risky behavior arises from interacting with objects without regard to their affordances. To prevent the agent from making unsafe decisions, we propose training a robotic agent by reinforcement learning to execute tasks with an awareness of physical properties such as mass and friction in an indoor environment. We achieve this through a novel physics-inspired reward function that encourages the agent to learn a policy that discriminates between different masses and friction coefficients. We introduce two novel and challenging indoor rearrangement tasks -- the variable friction pushing task and the variable mass pushing task -- that allow evaluation of how the learned policies trade off performance against physics-inspired risk. Our results demonstrate that, when equipped with the proposed reward, the agent learns policies that choose the pushing targets or goal-reaching trajectories with minimum physical cost, which can further serve as a precaution to constrain the agent's behavior in a safety-critical environment.
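The abstract does not give the form of the physics-inspired reward, so the following is only a minimal sketch of one plausible shape, not the authors' formulation. It assumes the physical cost of a push is the work done against kinetic friction (friction coefficient × mass × g × sliding distance) and that this cost is subtracted from a task-progress term. The names `PushStep`, `physical_cost`, `physics_aware_reward`, and the weight `cost_weight` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2


@dataclass
class PushStep:
    """One environment step of a pushing task (hypothetical structure)."""
    mass: float      # kg, mass of the pushed object
    friction: float  # kinetic friction coefficient between object and floor
    distance: float  # metres the object slid during this step
    progress: float  # reduction in distance-to-goal during this step


def physical_cost(step: PushStep) -> float:
    # Assumption: cost is the work done against kinetic friction,
    # W = mu * m * g * d, in joules.
    return step.friction * step.mass * G * step.distance


def physics_aware_reward(step: PushStep, cost_weight: float = 0.1) -> float:
    # Assumption: task progress minus a weighted physical cost.
    # cost_weight is an illustrative trade-off parameter, not a value
    # taken from the paper.
    return step.progress - cost_weight * physical_cost(step)


# Two pushes with identical task progress but different object masses.
light = PushStep(mass=1.0, friction=0.2, distance=0.5, progress=0.5)
heavy = PushStep(mass=5.0, friction=0.2, distance=0.5, progress=0.5)
```

Under these assumptions, pushing the lighter object yields the higher reward for the same progress, which is the kind of preference the abstract attributes to the learned policies. Raising `cost_weight` shifts the trade-off from task performance toward lower physical cost.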