We present a classification-based approach to next-best-view selection and show how a supervisory signal can plausibly be obtained for this task. The proposed approach is end-to-end trainable and aims to achieve the best possible 3D reconstruction quality from a pair of passively acquired 2D views. The model consists of two stages, a classifier and a reconstructor network, trained jointly via indirect 3D supervision from ground-truth voxels. At test time, the method assumes no prior knowledge of the underlying 3D shape when selecting the next best view. We demonstrate the method's effectiveness through detailed experiments on synthetic and real images and show that it yields better reconstruction quality than existing state-of-the-art 3D reconstruction and next-best-view prediction techniques.
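The abstract does not spell out how the classification label is derived from the ground-truth voxels. One plausible reading, sketched below purely as an illustration, is to reconstruct the shape once per candidate second view, score each reconstruction by voxel IoU against the ground truth, and use the index of the best-scoring view as the class label. The function names `voxel_iou` and `best_view_label`, the 0.5 occupancy threshold, and the toy grids are all assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np


def voxel_iou(pred, gt, threshold=0.5):
    """IoU between a predicted occupancy grid and a binary ground-truth grid.

    The 0.5 threshold is an assumption of this sketch.
    """
    pred_occ = pred >= threshold
    gt_occ = gt.astype(bool)
    union = np.logical_or(pred_occ, gt_occ).sum()
    if union == 0:
        # Both grids are empty, so treat them as a perfect match.
        return 1.0
    return float(np.logical_and(pred_occ, gt_occ).sum() / union)


def best_view_label(candidate_recons, gt):
    """Return the index of the candidate view whose reconstruction has the
    highest IoU, together with the IoU score of every candidate."""
    scores = [voxel_iou(recon, gt) for recon in candidate_recons]
    return int(np.argmax(scores)), scores


# Toy data (hypothetical): a 2x2x2 cube inside a 4x4x4 grid.
gt = np.zeros((4, 4, 4))
gt[1:3, 1:3, 1:3] = 1

recon_a = np.zeros_like(gt)
recon_a[1:3, 1:3, 1:2] = 0.9       # recovers half of the cube
recon_b = gt * 0.8                 # recovers the whole cube
recon_c = np.full_like(gt, 0.9)    # fills the entire grid

label, scores = best_view_label([recon_a, recon_b, recon_c], gt)
```

In this toy case the second candidate would be chosen as the label. A label obtained this way needs ground-truth voxels only during training, which is consistent with the abstract's statement that no prior knowledge of the 3D shape is assumed at test time.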