Diagnosing and Augmenting Feature Representations in Correctional Inverse Reinforcement Learning

Robots have become increasingly better at performing tasks for humans by learning from their feedback, but they still often suffer from model misalignment due to missing or incorrectly learned features. When the features the robot needs to perform its task are missing or do not generalize well to new settings, the robot will not be able to learn the task the human wants and, even worse, may learn a completely different and undesired behavior. Prior work shows how the robot can detect when its representation is missing some feature and can thus ask the human to teach it the new feature; however, these works do not differentiate between features that are completely missing and those that exist but do not generalize to new environments. In the latter case, the robot would detect misalignment and simply learn a new feature, leading to an arbitrarily growing feature representation that can, in turn, lead to spurious correlations and incorrect learning down the line. In this work, we propose separating the two sources of misalignment: we present a framework for determining whether a feature the robot needs is incorrectly learned and does not generalize to new environment setups, or is entirely missing from the robot's representation. Once we detect the source of error, we show how the human can initiate the realignment process for the model: if the feature is missing, we follow prior work for learning new features; if the feature exists but does not generalize, we use data augmentation to expand its training and thus complete the correction. We demonstrate the proposed approach in experiments with a simulated 7-DoF robot manipulator and physical human corrections.
