SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation - Zhuanzhi Papers

November 22, 2022

SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Wenxuan Zhang, Xiaodong Cun, Xuan Wang, Yong Zhang, Xi Shen, Yu Guo, Ying Shan, Fei Wang

From arXiv. Project page: https://sadtalker.github.io

Generating talking head videos from a face image and a piece of speech audio still poses many challenges, i.e., unnatural head movement, distorted expressions, and identity modification. We argue that these issues arise mainly from learning from coupled 2D motion fields. On the other hand, explicitly using 3D information also suffers from stiff expressions and incoherent video. We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face render for talking head generation. To learn realistic motion coefficients, we explicitly model the connections between audio and the different types of motion coefficients individually. Specifically, we present ExpNet to learn accurate facial expressions from audio by distilling both coefficients and 3D-rendered faces. For the head pose, we design PoseVAE, a conditional VAE, to synthesize head motion in different styles. Finally, the generated 3D motion coefficients are mapped to the unsupervised 3D keypoint space of the proposed face render, which synthesizes the final video. We conduct extensive experiments to show the superiority of our method in terms of motion and video quality.
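The abstract describes PoseVAE only as a conditional VAE that synthesizes head motion in different styles. As a rough illustration of what sampling from a conditional VAE decoder can look like, the sketch below draws a latent vector from a standard normal prior, concatenates it with a per-frame audio feature and a one-hot style label, and decodes the result to a pose vector. Everything here is an assumption made for illustration: the dimensions, the choice of audio features and a style label as conditions, the two-layer decoder, the random untrained weights, and the names `make_decoder` and `sample_pose`. It is not the authors' architecture.

```python
import math
import random

# Hypothetical sizes, chosen only for illustration (not taken from the paper).
AUDIO_DIM, STYLE_COUNT, LATENT_DIM, HIDDEN_DIM, POSE_DIM = 16, 4, 8, 32, 6


def random_matrix(rows, cols, rng, scale=0.1):
    """Build a rows x cols matrix of small Gaussian values."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(vec, mat):
    """Multiply a row vector (len = rows of mat) by a matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def make_decoder(rng):
    """Random weights standing in for a trained decoder (assumption)."""
    in_dim = LATENT_DIM + AUDIO_DIM + STYLE_COUNT
    return {
        "w1": random_matrix(in_dim, HIDDEN_DIM, rng),
        "w2": random_matrix(HIDDEN_DIM, POSE_DIM, rng),
    }


def sample_pose(decoder, audio_frames, style_id, rng):
    """Sample one pose vector per frame, conditioned on audio and style."""
    style = [1.0 if i == style_id else 0.0 for i in range(STYLE_COUNT)]
    poses = []
    for audio_feat in audio_frames:
        z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]  # z ~ N(0, I)
        x = z + list(audio_feat) + style  # condition by concatenation
        hidden = [math.tanh(h) for h in matvec(x, decoder["w1"])]
        poses.append(matvec(hidden, decoder["w2"]))
    return poses


rng = random.Random(0)
decoder = make_decoder(rng)
# 25 frames of stand-in audio features (random values, not real audio).
audio = [[rng.gauss(0.0, 1.0) for _ in range(AUDIO_DIM)] for _ in range(25)]
poses = sample_pose(decoder, audio, style_id=1, rng=rng)
print(len(poses), len(poses[0]))
```

Because the latent vector is resampled on every call, the same audio and style label yield different pose sequences each time, which is the property that lets a conditional VAE produce varied motion rather than a single deterministic output.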
