Despite the impressive progress of telepresence systems for room-scale scenes with static and dynamic scene entities, extending their capabilities to larger dynamic environments beyond a fixed size of a few square meters remains challenging. In this paper, we aim to share 3D live-telepresence experiences in large-scale environments beyond room scale, with both static and dynamic scene entities, at practical bandwidth requirements, based only on lightweight scene capture with a single moving consumer-grade RGB-D camera. To this end, we present a system built upon a novel hybrid volumetric scene representation: a voxel-based representation for static content, which stores not only the reconstructed surface geometry but also object semantics and the objects' accumulated dynamic movement over time, combined with a point-cloud-based representation for dynamic scene parts, where the separation from static parts is achieved based on semantic and instance information extracted from the input frames. By streaming static and dynamic content independently yet simultaneously, seamlessly integrating potentially moving but currently static scene entities into the static model until they become dynamic again, and fusing static and dynamic data at the remote client, our system achieves VR-based live-telepresence at interactive rates. Our evaluation demonstrates the potential of our novel approach in terms of visual quality and performance, and includes ablation studies on the involved design choices.