We propose an algorithm that predicts room layout from a single image and generalizes across panoramas and perspective images, and across cuboid layouts and more general layouts (e.g., L-shaped rooms). Our method operates directly on the panoramic image rather than decomposing it into perspective images, as recent works do. Our network architecture is similar to that of RoomNet, but we show improvements from aligning the image based on vanishing points, predicting multiple layout elements (corners, boundaries, size, and translation), and fitting a constrained Manhattan layout to the resulting predictions. Our method compares well in speed and accuracy with existing work on panoramas, achieves among the best accuracy for perspective images, and handles both cuboid-shaped and more general Manhattan layouts.