Learning requires both study and curiosity. A good learner is not only good at extracting information from the data given to it, but also skilled at finding the right new information to learn from. This is especially true when a human operator is required to provide the ground truth; such a source should be queried only sparingly. In this work, we address the problem of curiosity as it relates to online, real-time, human-in-the-loop training of an object detection algorithm onboard a robotic platform, where motion produces new views of the subject. We propose a deep reinforcement learning approach that decides when to ask the human user for ground truth and when to move. Through a series of experiments, we demonstrate that our agent learns a movement and request policy that is at least 3x more effective at using human user interactions to train an object detector than untrained approaches, and that generalizes to a variety of subjects and environments.