Single-view RGB-D human reconstruction with implicit functions is often formulated as per-point classification. Specifically, a set of 3D locations within the view frustum of the camera is first projected independently onto the image, and a corresponding feature is subsequently extracted for each 3D location. The feature of each 3D location is then used to classify independently whether the corresponding 3D point is inside or outside the observed object. This procedure leads to sub-optimal results because correlations between predictions for neighboring locations are only taken into account implicitly via the extracted features. For more accurate results, we propose the occupancy planes (OPlanes) representation, which allows single-view RGB-D human reconstruction to be formulated as occupancy prediction on planes that slice through the camera's view frustum. Such a representation provides more flexibility than voxel grids and makes it possible to leverage correlations better than per-point classification. On the challenging S3D data, we observe that a simple classifier based on the OPlanes representation yields compelling results, especially in difficult situations with partial occlusions due to other objects and partial visibility, which have not been addressed by prior work.
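The core idea above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration (not the paper's implementation): it slices the view frustum into fronto-parallel planes, unprojects each plane to 3D points, and uses a simple depth-based heuristic in place of the learned per-plane occupancy classifier. All function names, shapes, and the `thickness` parameter are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of the OPlanes idea: instead of classifying
# independent 3D points, predict an (H, W) occupancy map on each of
# K planes slicing the camera's view frustum.

def plane_depths(near, far, k):
    """Depths of K fronto-parallel planes spanning the frustum."""
    return np.linspace(near, far, k)

def unproject_plane(depth, k_inv, h, w):
    """3D camera-space points (h, w, 3) of the plane at `depth`,
    given the inverse camera intrinsics `k_inv`."""
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ k_inv.T   # camera-space rays with z = 1
    return rays * depth    # scale rays onto the plane

def occupancy_on_plane(depth, depth_map, thickness=0.05):
    """Toy stand-in for the learned classifier: a pixel is 'inside'
    if the plane lies behind the observed front surface but within a
    fixed object thickness. Returns an (H, W) boolean occupancy map,
    so neighboring predictions share the plane's spatial structure."""
    return (depth_map <= depth) & (depth <= depth_map + thickness)
```

In the paper's formulation a learned network replaces `occupancy_on_plane`, but the interface is the point: each plane produces a dense 2D occupancy map, so spatial correlations between neighboring locations can be modeled explicitly rather than via independent per-point queries.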