In this paper, we present a spatial rectifier to estimate surface normals of tilted images. Tilted images are of particular interest as more visual data are captured by arbitrarily oriented sensors such as body-/robot-mounted cameras. Existing approaches exhibit limited performance when predicting surface normals of such images because they were trained on gravity-aligned images. Our two main hypotheses are: (1) the visual scene layout is indicative of the gravity direction; and (2) not all surfaces are equally represented by a learned estimator due to the structured distribution of the training data; thus, for each tilted image there exists a transformation to which the learned estimator is more responsive than to others. We design a spatial rectifier that learns to transform the surface normal distribution of a tilted image into a rectified one that matches the gravity-aligned training data distribution. Along with the spatial rectifier, we propose a novel truncated angular loss that offers a stronger gradient at smaller angular errors and robustness to outliers. The resulting estimator outperforms state-of-the-art methods, including data augmentation baselines, not only on ScanNet and NYUv2 but also on a new dataset called Tilt-RGBD that includes considerable roll and pitch camera motion.
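The abstract does not state the form of the truncated angular loss, so the following is only an illustrative sketch of one loss with the two stated properties. It assumes the error is the angle between predicted and ground-truth unit normals (the arccos of their dot product, whose gradient grows as the angle shrinks) and that errors beyond 90 degrees are capped at a constant so that outliers contribute a bounded penalty. The function name `truncated_angular_loss`, the `eps` clipping margin, and the 90-degree cap are assumptions made for this example, not the paper's definition.

```python
import numpy as np


def truncated_angular_loss(pred, target, eps=1e-6):
    """Hypothetical truncated angular loss (illustration only).

    pred, target: arrays of shape (N, 3) holding surface normals.
    Returns the mean angular error in radians, capped at pi/2 per sample.
    """
    # Normalize both sets of vectors so the dot product is a cosine.
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)

    # Cosine of the angle between each predicted and ground-truth normal.
    cos = np.sum(pred * target, axis=1)

    # Clip away from +/-1 so arccos (and its gradient in an autodiff
    # framework) stays finite. This margin is an assumption.
    angle = np.arccos(np.clip(cos, -1.0 + eps, 1.0 - eps))

    # Truncate: any error above 90 degrees counts as a constant pi/2,
    # which bounds the influence of outliers (assumed threshold).
    return np.minimum(angle, np.pi / 2).mean()


if __name__ == "__main__":
    up = np.array([[0.0, 0.0, 1.0]])
    tilted = np.array([[0.0, 1.0, 1.0]])   # 45 degrees away from `up`
    flipped = np.array([[0.0, 0.0, -1.0]])  # 180 degrees away from `up`
    print(truncated_angular_loss(tilted, up))
    print(truncated_angular_loss(flipped, up))
```

In this sketch a 45-degree error is penalized by its angle, while a fully flipped normal is penalized no more than a 90-degree error would be.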