Delivering immersive 3D experiences for human communication requires a method for obtaining 360-degree photo-realistic avatars of humans. To make these experiences accessible to all, only commodity hardware, such as mobile phone cameras, should be needed to capture the data for avatar creation. For avatars to be rendered realistically from any viewpoint, we require training images and camera poses covering all angles. However, we cannot rely on trackable features being present in the foreground or background of every image for pose estimation, especially from the side or back of the head. To overcome this, we propose a novel landmark detector, trained on synthetic data, that estimates camera poses from 360-degree mobile phone videos of a human head; these poses drive a multi-stage optimization process that creates a photo-realistic avatar. We validate our method with experiments on synthetic data and showcase 360-degree avatars trained from mobile phone videos.
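To make the pose-estimation step concrete: given 2D landmarks detected in a frame and their corresponding 3D positions on a template head, the camera pose can be recovered with a perspective-n-point solve. The sketch below uses the classic Direct Linear Transform with known camera intrinsics; it is an illustrative, numpy-only implementation, not the paper's actual pipeline, and the function name `estimate_pose_dlt` is our own.

```python
import numpy as np

def estimate_pose_dlt(pts3d, pts2d, K):
    """Recover camera rotation R and translation t from N >= 6 noise-free
    2D-3D landmark correspondences via the Direct Linear Transform (DLT).
    Illustrative sketch only; a real pipeline would use a robust PnP solver."""
    # Normalize pixel coordinates with the known intrinsics: x_n = K^-1 [u, v, 1]^T
    homog = np.hstack([pts2d, np.ones((len(pts2d), 1))])
    xn = (np.linalg.inv(K) @ homog.T).T
    # Build the 2N x 12 homogeneous system A p = 0 for the entries of P = [R | t]
    A = []
    for (X, Y, Z), (u, v, _) in zip(pts3d, xn):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)          # solution up to an unknown scale and sign
    if np.linalg.det(P[:, :3]) < 0:   # pick the sign that gives det(R) > 0
        P = -P
    # Project the left 3x3 block onto the nearest rotation, recover the scale
    U, S, Vt3 = np.linalg.svd(P[:, :3])
    R = U @ Vt3
    t = P[:, 3] / S.mean()
    return R, t
```

In the setting described above, the 2D landmarks would come from the synthetic-data-trained detector, which is what makes pose estimation possible even for views of the side or back of the head where conventional trackable features are absent.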