Extracting detailed 3D information of objects from video data is an important goal for holistic scene understanding. While recent methods have shown impressive results when reconstructing meshes of objects from a single image, results often remain ambiguous as part of the object is unobserved. Moreover, existing image-based datasets for mesh reconstruction do not permit studying models that integrate temporal information. To alleviate both concerns we present SAIL-VOS 3D: a synthetic video dataset with frame-by-frame mesh annotations which extends SAIL-VOS. We also develop the first baselines for reconstruction of 3D meshes from video data via temporal models. We demonstrate the efficacy of the proposed baselines on SAIL-VOS 3D and Pix3D, showing that temporal information improves reconstruction quality. Resources and additional information are available at http://sailvos.web.illinois.edu.