Presenting and analyzing image data from a single viewpoint is often insufficient to solve a task; several viewpoints are needed to gather more information. The $\textit{next-best-view}$ problem seeks the viewpoint that yields the greatest information gain for the underlying task. In this work, a robot arm holds an object in its end-effector and searches for a sequence of next-best-views that unambiguously identifies the object. We use Soft Actor-Critic (SAC), a deep reinforcement learning method, to learn these next-best-views for a specific set of objects. The evaluation shows that an agent can learn to determine an object pose to which the robot arm should move the object. The resulting viewpoint yields a more accurate prediction, making it easier to distinguish the object from other objects. We make the code publicly available to the scientific community, for reproducibility, at $\href{https://github.com/ckorbach/nbv_rl}{\text{this https link}}$.
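The notion of "greatest information gain" in the next-best-view problem can be illustrated with a small sketch. It is not the method of this work, which learns views with SAC; it is a greedy, one-step selection under the assumption that information gain is measured as the reduction in Shannon entropy of a classifier's belief over object classes. The function names, the view names, and all probability values are hypothetical and chosen only for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def information_gain(belief_before, belief_after):
    """Entropy reduction when the belief changes from before to after."""
    return entropy(belief_before) - entropy(belief_after)


def next_best_view(belief, candidate_beliefs):
    """Return the name of the candidate view with the largest entropy reduction.

    candidate_beliefs maps a (hypothetical) view name to the class
    distribution the classifier is assumed to produce from that view.
    """
    return max(
        candidate_beliefs,
        key=lambda view: information_gain(belief, candidate_beliefs[view]),
    )


# Hypothetical belief over three object classes from the current viewpoint.
belief = [0.40, 0.35, 0.25]

# Hypothetical beliefs the classifier would hold after each candidate view.
candidates = {
    "front": [0.50, 0.30, 0.20],
    "side": [0.90, 0.05, 0.05],
    "top": [0.34, 0.33, 0.33],
}

best = next_best_view(belief, candidates)
print(best, round(information_gain(belief, candidates[best]), 3))
```

In practice the post-view beliefs are not known in advance, which is one reason a learned policy is attractive: the agent has to predict which object pose will be informative before the robot arm moves there.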