We consider the problem of Multi-view 3D Face Reconstruction (MVR) with weakly supervised learning, which leverages a limited number of 2D face images (e.g., three) to generate a high-quality 3D face model with very light annotation. Despite their encouraging performance, existing MVR methods simply concatenate multi-view image features and pay less attention to critical areas (e.g., eyes, brows, nose, and mouth). To this end, we propose a novel model called Deep Fusion MVR (DF-MVR) and design a multi-view encoding to single decoding framework with skip connections that can extract, integrate, and compensate deep features from multi-view images with attention. In addition, we develop a multi-view face parsing network to learn, identify, and emphasize the critical common face areas. Finally, although our model is trained with a few 2D images per subject, it can reconstruct an accurate 3D model even when only a single 2D image is given as input. We conduct extensive experiments comparing various multi-view 3D face reconstruction methods, and the results on the Pixel-Face and Bosphorus datasets indicate the superiority of our model. Without 3D landmark annotation, DF-MVR achieves 5.2% and 3.0% RMSE improvements over the best existing weakly supervised MVR methods on the Pixel-Face and Bosphorus datasets, respectively; with 3D landmark annotation, DF-MVR attains superior performance, particularly on the Pixel-Face dataset, where it yields a 13.4% RMSE improvement over the best weakly supervised MVR model.
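To illustrate the contrast drawn above between plain concatenation of multi-view features and fusion with attention, the following is a minimal sketch in pure Python. It is not the DF-MVR architecture: the function names, the dot-product scoring, and the hand-set `query` vector (standing in for learned parameters) are all assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concat_fusion(view_features):
    """Baseline: join the per-view feature vectors end to end."""
    return [x for feat in view_features for x in feat]


def attention_fusion(view_features, query):
    """Fuse views as a weighted sum, with weights from a softmax over views.

    `query` is a hypothetical stand-in for learned attention parameters;
    each view is scored by its dot product with `query`.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in view_features]
    weights = softmax(scores)
    dim = len(query)
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, view_features))
        for i in range(dim)
    ]
    return fused, weights


# Three toy "views" (e.g., left, frontal, right), each a 2-D feature vector.
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = attention_fusion(views, query=[0.0, 2.0])
stacked = concat_fusion(views)
```

In this toy setting, concatenation grows with the number of views and treats every view equally, whereas the attention-weighted sum keeps a fixed feature size and lets views that score higher against `query` contribute more. The fixed output size also means the same fusion step accepts a single view, which is loosely analogous to the single-image inference mentioned above.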