Next-Best-View Estimation based on Deep Reinforcement Learning for Active Object Classification

Presenting and analyzing image data from a single viewpoint is often not sufficient to solve a task; several viewpoints are needed to obtain more information. The next-best-view problem is to find the viewpoint with the greatest information gain for the underlying task. In this work, a robot arm holds an object in its end-effector and searches for a sequence of next-best-views to explicitly identify the object. We use Soft Actor-Critic (SAC), a deep reinforcement learning method, to learn these next-best-views for a specific set of objects. The evaluation shows that an agent can learn to determine an object pose to which the robot arm should move the object. The resulting viewpoint yields a more accurate prediction and thus better distinguishes the object from other objects. We make the code publicly available for the scientific community and for reproducibility.
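
To make the notion of information gain concrete, the following minimal Python sketch scores candidate views by how much they reduce the entropy of a class belief. It is an illustration under stated assumptions, not the method of this work: the view names, the class probabilities, and the helper functions `entropy`, `fuse`, and `information_gain` are all hypothetical, and the greedy entropy criterion stands in for the learned SAC policy. In a real setting the classifier output for a view is not known before the arm moves, which is one reason to learn a policy rather than enumerate views as done here.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def fuse(belief, likelihood):
    """Bayesian fusion: element-wise product followed by renormalisation."""
    unnorm = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("belief and likelihood have no common support")
    return [u / total for u in unnorm]


def information_gain(belief, likelihood):
    """Entropy reduction obtained by fusing one observation into the belief."""
    return entropy(belief) - entropy(fuse(belief, likelihood))


# Assumption: three object classes and a uniform prior belief.
belief = [1 / 3, 1 / 3, 1 / 3]

# Assumption: hypothetical classifier outputs for three candidate views.
candidate_views = {
    "front": [0.40, 0.35, 0.25],
    "side": [0.80, 0.10, 0.10],
    "top": [0.34, 0.33, 0.33],
}

# Greedy choice: the view whose observation reduces uncertainty the most.
best_view = max(
    candidate_views,
    key=lambda v: information_gain(belief, candidate_views[v]),
)
print(best_view)
```

With these made-up numbers the most discriminative view is selected because its classifier output is the most peaked; a learned agent would instead map the current observation to a target object pose directly.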
