We introduce a convolutional neural network model for unsupervised learning of depth and ego-motion from cylindrical panoramic video. Panoramic depth estimation is an important technology for applications such as virtual reality, 3D modeling, and autonomous robotic navigation. In contrast to previous approaches for applying convolutional neural networks to panoramic imagery, we use the cylindrical panoramic projection, which allows traditional CNN layers such as convolutional filters and max pooling to be used without modification. Our evaluation on synthetic and real data shows that unsupervised learning of depth and ego-motion on cylindrical panoramic images can produce high-quality depth maps, and that an increased field of view improves ego-motion estimation accuracy. We create two new datasets to evaluate our approach: a synthetic dataset created using the CARLA simulator, and Headcam, a novel dataset of panoramic video collected from a helmet-mounted camera while biking in an urban setting. We also apply our network to the problem of converting monocular panoramas to stereo panoramas.
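A practical consequence of the cylindrical projection is that the image is continuous across its left and right edges, which is commonly exploited by wrap-around (circular) padding before each convolution. The sketch below illustrates this idea only; the function name and the use of NumPy are our assumptions, not details from the paper.

```python
import numpy as np

def wrap_pad_horizontal(feature_map, pad):
    """Pad an (H, W, C) cylindrical feature map by wrapping horizontally,
    so a standard convolution sees the panorama's circular continuity.
    Illustrative sketch; not the paper's actual implementation."""
    left = feature_map[:, -pad:, :]   # columns from the right edge
    right = feature_map[:, :pad, :]   # columns from the left edge
    return np.concatenate([left, feature_map, right], axis=1)

# Example: a tiny 2x4 "panorama" with 3 channels.
fm = np.arange(24, dtype=np.float32).reshape(2, 4, 3)
padded = wrap_pad_horizontal(fm, 1)
print(padded.shape)  # (2, 6, 3): one wrapped column added on each side
```

After this padding, an ordinary "valid" convolution over the width dimension behaves as if the panorama were seamless, with no modified CNN layers required.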