February 16, 2023

OPT: One-shot Pose-Controllable Talking Head Generation


Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

From arXiv; accepted by ICASSP 2023

One-shot talking head generation produces lip-synced talking heads from arbitrary audio and a single source face. To make the results look natural and realistic, recent methods aim for free pose control rather than editing only the mouth region. However, existing methods do not accurately preserve the identity of the source face when generating head motion. To solve this identity mismatch problem and achieve high-quality free pose control, we present the One-shot Pose-controllable Talking head generation network (OPT). Specifically, the Audio Feature Disentanglement Module separates content features from the audio, eliminating the influence of speaker-specific information contained in arbitrary driving audio. Next, the mouth expression feature is extracted from the content feature and the source face; a landmark loss is applied at this stage to improve the accuracy of the facial structure and the quality of identity preservation. Finally, to achieve free pose control, controllable head pose features from reference videos are fed into the Video Generator together with the expression feature and the source face to generate new talking heads. Extensive quantitative and qualitative experimental results verify that OPT generates high-quality pose-controllable talking heads without the identity mismatch problem, outperforming previous SOTA methods.
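The abstract names a landmark loss but does not give its formula. As a rough illustration only, one common form of such a loss is the mean Euclidean distance between predicted and ground-truth 2D facial landmarks. The function name `landmark_loss`, the 2D point representation, and the sample coordinates below are all assumptions made for this sketch, not details taken from the paper.

```python
import math


def landmark_loss(predicted, target):
    """Mean Euclidean distance between matching 2D landmarks.

    Assumed form for illustration; the paper's actual loss may differ.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("landmark lists must be non-empty and equal in length")
    total = 0.0
    for (px, py), (tx, ty) in zip(predicted, target):
        # Distance between one predicted point and its ground-truth point.
        total += math.hypot(px - tx, py - ty)
    return total / len(predicted)


# Hypothetical mouth landmarks in pixel coordinates, for illustration only.
target = [(10.0, 20.0), (14.0, 20.0), (12.0, 23.0)]
predicted = [(10.0, 20.0), (17.0, 24.0), (12.0, 23.0)]
loss = landmark_loss(predicted, target)
```

A loss of this kind is zero when every predicted landmark coincides with its target and grows as the predicted facial structure drifts away from it, which is why it is a plausible training signal for keeping the mouth shape and face geometry close to the source identity.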

