[Paopao One Minute] 3D Room Geometry Reconstruction Using Audio-Visual Sensors (3dv-16)

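December 18, 2017 · Paopao Robot SLAM (泡泡机器人SLAM) · Paopao One Minute

One minute a day to walk you through papers from the top robotics conferences

Title: 3D Room Geometry Reconstruction Using Audio-Visual Sensors

Authors: Hansung Kim, Luca Remaggi, Philip JB Jackson, Filippo Maria Fazi, and Adrian Hilton

Source: 3dv-2017 (International Conference on 3D Vision)

Narrator: Amy

Translated and compiled by: Zhao Jianglong (赵江龙), Zhou Ping (周平)

Individuals are welcome to share this post on WeChat Moments. Other organizations or media outlets that wish to republish it should leave a message via the account backend to request authorization.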

Summary

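This paper proposes a cuboid-based method for estimating air-tight (fully closed) indoor room geometry using a combination of audio-visual sensors. Existing vision-based 3D reconstruction methods are not applicable to scenes that contain transparent or reflective objects such as windows and mirrors.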

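In this work, we fuse multi-modal sensory information to overcome the limitations of purely visual reconstruction and to reconstruct complex scenes that include transparent and mirror surfaces. The full scene is captured by 360° cameras and by acoustic room impulse responses (RIRs) recorded with a loudspeaker and a compact microphone array. Depth information is recovered by stereo matching on the captured images and by estimating the locations of the major acoustic reflectors from the sound. The coordinate systems of the audio and visual sensors are aligned into a unified reference frame, and plane elements are reconstructed from the audio-visual data.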
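To make the acoustic step more concrete, here is a minimal sketch of one common way to turn a localized reflection into a wall plane: the image-source model, in which a first-order reflection appears to come from a mirror image of the loudspeaker behind the wall. The wall is then the perpendicular bisector of the segment between the loudspeaker and its image. This is an illustration under that assumption, not necessarily the procedure used in the paper; the function name `reflector_plane` and the example coordinates are hypothetical.

```python
import numpy as np

def reflector_plane(source, image_source):
    """Return (normal, d) of the plane normal . x = d that mirrors
    `source` onto `image_source` (image-source model, first-order reflection)."""
    source = np.asarray(source, dtype=float)
    image_source = np.asarray(image_source, dtype=float)
    offset = image_source - source
    length = np.linalg.norm(offset)
    if length == 0.0:
        raise ValueError("source and image source coincide; no plane is defined")
    normal = offset / length                   # unit normal, pointing from source to wall
    midpoint = 0.5 * (source + image_source)   # the midpoint lies on the reflecting plane
    return normal, float(normal @ midpoint)

# Hypothetical example: loudspeaker at (1, 1, 1), image source at (5, 1, 1).
normal, d = reflector_plane([1.0, 1.0, 1.0], [5.0, 1.0, 1.0])
print(normal, d)
```

In this example the recovered plane is the wall x = 3, halfway between the loudspeaker and its image. Planes obtained this way do not depend on surface texture or transparency, which is why acoustic estimates can complement stereo depth on windows and mirrors.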

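Finally, cuboid proxies are fitted to the planes to generate a complete room model. Experimental results show that the proposed system produces complete representations of the room structure regardless of transparent windows, featureless walls, and shiny surfaces.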
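As a rough illustration of fitting a cuboid proxy to a set of planes, the sketch below handles the simplest case: the planes have already been rotated into a room-aligned frame, and a point known to be inside the room (for example the sensor position) is available. Each plane is assigned to the face it most closely matches, and planes on the same face are averaged. The function name `fit_axis_aligned_cuboid`, the `inside` parameter, and the example room are hypothetical; the paper's actual fitting procedure may differ.

```python
import numpy as np

def fit_axis_aligned_cuboid(planes, inside=(0.0, 0.0, 0.0)):
    """Fit an axis-aligned box to planes given as (normal, d) with normal . x = d.

    Returns (lower, upper) corner arrays; a face with no supporting plane is NaN.
    Assumes the planes are roughly axis-aligned and `inside` lies in the room.
    """
    inside = np.asarray(inside, dtype=float)
    hits = {(axis, side): [] for axis in range(3) for side in (0, 1)}
    for normal, d in planes:
        normal = np.asarray(normal, dtype=float)
        axis = int(np.argmax(np.abs(normal)))        # dominant axis of the normal
        # Signed step along `axis` from the interior point to the plane.
        step = (d - normal @ inside) / normal[axis]
        side = 0 if step < 0 else 1                  # 0 = lower face, 1 = upper face
        hits[(axis, side)].append(inside[axis] + step)

    def face(axis, side):
        values = hits[(axis, side)]
        return float(np.mean(values)) if values else float("nan")

    lower = np.array([face(axis, 0) for axis in range(3)])
    upper = np.array([face(axis, 1) for axis in range(3)])
    return lower, upper

# Hypothetical room spanning [-2, 3] x [-1, 4] x [-1.5, 1.5], sensors at the origin.
planes = [
    ([1, 0, 0], 3.0), ([-1, 0, 0], 2.0),
    ([0, 1, 0], 4.0), ([0, -1, 0], 1.0),
    ([0, 0, 1], 1.5), ([0, 0, -1], 1.5),
]
lower, upper = fit_axis_aligned_cuboid(planes)
print(lower, upper)
```

Because every face of the box is defined once it has a supporting plane from either modality, the resulting model stays closed even where one sensor type fails, which matches the "air-tight" goal described above.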

Abstract

In this paper we propose a cuboid-based air-tight indoor room geometry estimation method using combination of audio-visual sensors. Existing vision-based 3D reconstruction methods are not applicable for scenes with transparent or reflective objects such as windows and mirrors. In this work we fuse multi-modal sensory information to overcome the limitations of purely visual reconstruction for reconstruction of complex scenes including transparent and mirror surfaces. A full scene is captured by 360° cameras and acoustic room impulse responses (RIRs) recorded by a loudspeaker and compact microphone array. Depth information of the scene is recovered by stereo matching from the captured images and estimation of major acoustic reflector locations from the sound. The coordinate systems for audio-visual sensors are aligned into a unified reference frame and plane elements are reconstructed from audio-visual data. Finally cuboid proxies are fitted to the planes to generate a complete room model. Experimental results show that the proposed system generates complete representations of the room structures regardless of transparent windows, featureless walls and shiny surfaces.

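If you are interested in this paper and would like to download the full text, follow the Paopao Robot SLAM (泡泡机器人SLAM) public account.

Reply with the keyword "3dv16" in the Paopao Robot SLAM public account (paopaorobot_slam) to receive the download link for this paper.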