UniFLG: Unified Facial Landmark Generator from Text or Speech - 专知论文

会员服务 ·

0

语音合成 · Extensibility · state-of-the-art · Integration · 端到端 ·

2023 年 5 月 19 日

UniFLG: Unified Facial Landmark Generator from Text or Speech

翻译：暂无翻译

Kentaro Mitsui,Yukiya Hono,Kei Sawada

from arxiv, 5 pages, 2 figures, 3 tables, accepted for INTERSPEECH 2023. Audio samples: https://rinnakk.github.io/research/publications/UniFLG

Talking face generation has been extensively investigated owing to its wide applicability. The two primary frameworks used for talking face generation comprise a text-driven framework, which generates synchronized speech and talking faces from text, and a speech-driven framework, which generates talking faces from speech. To integrate these frameworks, this paper proposes a unified facial landmark generator (UniFLG). The proposed system exploits end-to-end text-to-speech not only for synthesizing speech but also for extracting a series of latent representations that are common to text and speech, and feeds it to a landmark decoder to generate facial landmarks. We demonstrate that our system achieves higher naturalness in both speech synthesis and facial landmark generation compared to the state-of-the-art text-driven method. We further demonstrate that our system can generate facial landmarks from speech of speakers without facial video data or even speech data.

翻译：暂无翻译

0

相关内容

语音合成

语音合成（Speech Synthesis），也称为文语转换（Text-to-Speech, TTS,它是将任意的输入文本转换成自然流畅的语音输出。语音合成涉及到人工智能、心理学、声学、语言学、数字信号处理、计算机科学等多个学科技术，是信息处理领域中的一项前沿技术。随着计算机技术的不断提高，语音合成技术从早期的共振峰合成,逐步发展为波形拼接合成和统计参数语音合成，再发展到混合语音合成；合成语音的质量、自然度已经得到明显提高，基本能满足一些特定场合的应用需求。目前，语音合成技术在银行、医院等的信息播报系统、汽车导航系统、自动应答呼叫中心等都有广泛应用，取得了巨大的经济效益。另外，随着智能手机、MP3、PDA 等与我们生活密切相关的媒介的大量涌现，语音合成的应用也在逐渐向娱乐、语音教学、康复治疗等领域深入。可以说语音合成正在影响着人们生活的方方面面。

自然语言处理顶会NAACL2022最佳论文出炉！

自然语言处理顶会NAACL2022最佳论文出炉！

专知会员服务

43+阅读 · 2022年6月30日

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

专知会员服务

27+阅读 · 2022年3月3日

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

专知会员服务

26+阅读 · 2020年7月24日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

血小板微粒活化mTOR通路介导糖尿病肾病早期肾小球内皮损伤的作用研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于空时频多维度自适应处理的SAR干扰抑制新理论与新方法

国家自然科学基金

2+阅读 · 2014年12月31日

视网膜年龄相关性黄斑病变OCT图像的三维分割算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

数控装备运行可靠性的多源时变耦合建模与实时评估研究

国家自然科学基金

1+阅读 · 2013年12月31日

血清miR-696在肌肉-肝脏对话调节代谢的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

SOCS-3基因多态性与CHC合并胰岛素抵抗的功能研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

盐酸小檗碱调节MTP启动子甲基化与肝脏脂肪含量的关系

国家自然科学基金

0+阅读 · 2009年12月31日

Jagged2high CD11bhigh 调节性树突状细胞防治cGVHD的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

传感器网络辅助下的认知无线电动态频谱资源管理研究

国家自然科学基金

1+阅读 · 2009年12月31日

A Simple and Effective Baseline for Attentional Generative Adversarial Networks

A Simple and Effective Baseline for Attentional Generative Adversarial Networks

Arxiv

0+阅读 · 2023年7月6日

On the Cultural Gap in Text-to-Image Generation

Arxiv

0+阅读 · 2023年7月6日

SCAT: Robust Self-supervised Contrastive Learning via Adversarial Training for Text Classification

Arxiv

0+阅读 · 2023年7月4日

PatternGPT :A Pattern-Driven Framework for Large Language Model Text Generation

Arxiv

0+阅读 · 2023年7月2日

Deep Learning for UAV-based Object Detection and Tracking: A Survey

Arxiv

61+阅读 · 2021年10月25日

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Arxiv

17+阅读 · 2020年9月8日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

Arxiv

15+阅读 · 2019年1月15日

Generating Adversarial Examples with Adversarial Networks

Arxiv

10+阅读 · 2018年1月15日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

自然语言处理顶会NAACL2022最佳论文出炉！

自然语言处理顶会NAACL2022最佳论文出炉！

专知会员服务

43+阅读 · 2022年6月30日

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

专知会员服务

27+阅读 · 2022年3月3日

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

专知会员服务

26+阅读 · 2020年7月24日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《毁灭算法：解析以色列在加沙的AI军事行动》

【COLT 2025最新教程】语言生成

以机器速度锁定目标：人工智能的能力与局限

【ICML2025】通过在线世界模型规划的持续强化学习

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

相关论文

A Simple and Effective Baseline for Attentional Generative Adversarial Networks

A Simple and Effective Baseline for Attentional Generative Adversarial Networks

Arxiv

0+阅读 · 2023年7月6日

On the Cultural Gap in Text-to-Image Generation

Arxiv

0+阅读 · 2023年7月6日

SCAT: Robust Self-supervised Contrastive Learning via Adversarial Training for Text Classification

Arxiv

0+阅读 · 2023年7月4日

PatternGPT :A Pattern-Driven Framework for Large Language Model Text Generation

Arxiv

0+阅读 · 2023年7月2日

Deep Learning for UAV-based Object Detection and Tracking: A Survey

Arxiv

61+阅读 · 2021年10月25日

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Adversarial Machine Learning in Image Classification: A Survey Towards the Defender's Perspective

Arxiv

17+阅读 · 2020年9月8日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks

Arxiv

15+阅读 · 2019年1月15日

Generating Adversarial Examples with Adversarial Networks

Arxiv

10+阅读 · 2018年1月15日

相关基金

血小板微粒活化mTOR通路介导糖尿病肾病早期肾小球内皮损伤的作用研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于空时频多维度自适应处理的SAR干扰抑制新理论与新方法

国家自然科学基金

2+阅读 · 2014年12月31日

视网膜年龄相关性黄斑病变OCT图像的三维分割算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

数控装备运行可靠性的多源时变耦合建模与实时评估研究

国家自然科学基金

1+阅读 · 2013年12月31日

血清miR-696在肌肉-肝脏对话调节代谢的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

SOCS-3基因多态性与CHC合并胰岛素抵抗的功能研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

盐酸小檗碱调节MTP启动子甲基化与肝脏脂肪含量的关系

国家自然科学基金

0+阅读 · 2009年12月31日

Jagged2high CD11bhigh 调节性树突状细胞防治cGVHD的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

传感器网络辅助下的认知无线电动态频谱资源管理研究

国家自然科学基金

1+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员