更多:同时的多视图 3D 对象识别和粒子估计 (MORE: Simultaneous Multi-View 3D Object Recognition and Pose Estimation)

Simultaneous object recognition and pose estimation are two key functionalities for robots to safely interact with humans as well as environments. Although both object recognition and pose estimation use visual input, most state-of-the-art tackles them as two separate problems since the former needs a view-invariant representation while object pose estimation necessitates a view-dependent description. Nowadays, multi-view Convolutional Neural Network (MVCNN) approaches show state-of-the-art classification performance. Although MVCNN object recognition has been widely explored, there has been very little research on multi-view object pose estimation methods, and even less on addressing these two problems simultaneously. The pose of virtual cameras in MVCNN methods is often predefined in advance, leading to bound the application of such approaches. In this paper, we propose an approach capable of handling object recognition and pose estimation simultaneously. In particular, we develop a deep object-agnostic entropy estimation model, capable of predicting the best viewpoints of a given 3D object. The obtained views of the object are then fed to the network to simultaneously predict the pose and category label of the target object. Experimental results showed that the views obtained from such positions are descriptive enough to achieve a good accuracy score. Furthermore, we designed a real-life serve drink scenario to demonstrate how well the proposed approach worked in real robot tasks. Code is available online at: github.com/SubhadityaMukherjee/more_mvcnn

翻译：同时的物体识别和估计是机器人安全地与人类和环境互动的两个关键功能。虽然对物体的识别和估计都使用视觉输入,但大多数最先进的技术将这两个问题作为两个不同的问题处理,因为前者需要视觉变化的表达方式,而物体则需要以视觉为依存的描述。现在,多视图相向神经网络(MVCNN)方法显示的是最新分类性能。虽然对MVCNN物体的识别进行了广泛探讨,但对多视图对象的识别提出了估算方法,而同时处理这两个问题的研究则更少。MVCNN方法中的虚拟相机的构成往往是预先界定的,从而束缚了这些方法的应用。在本文中,我们提出了一个能够同时处理物体识别和估计的方法。特别是,我们开发了一个深度的天体-认知性昆虫估计模型,能够预测特定3D对象的最佳观点。我们随后获得的物体观点被输入到网络中,以同时预测配置/分类目标的配置方法/分类。MVCNN方法中的虚拟相机的构成方式往往预先界定了MVCNN方法的形状,从而约束了这些方法的应用。我们提出了一种能够同时处理物体的准确度。我们所设计的模型的模型的模型,从而展示了一种正确的模型。我们如何从真实的轨道定位,从而展示了一种正确的选择。