使用分光图油漆校正发言中的错误预测 (Correcting Mispronunciations in Speech using Spectrogram Inpainting) - 专知论文

会员服务 ·

0

音素 · 图像修复 · Learning · 原点 · Performer ·

2022 年 6 月 30 日

Correcting Mispronunciations in Speech using Spectrogram Inpainting

翻译：使用分光图油漆校正发言中的错误预测

Talia Ben-Simon,Felix Kreuk,Faten Awwad,Jacob T. Cohen,Joseph Keshet

from arxiv, Accepted for publication at Interspeech 2022

Learning a new language involves constantly comparing speech productions with reference productions from the environment. Early in speech acquisition, children make articulatory adjustments to match their caregivers' speech. Grownup learners of a language tweak their speech to match the tutor reference. This paper proposes a method to synthetically generate correct pronunciation feedback given incorrect production. Furthermore, our aim is to generate the corrected production while maintaining the speaker's original voice. The system prompts the user to pronounce a phrase. The speech is recorded, and the samples associated with the inaccurate phoneme are masked with zeros. This waveform serves as an input to a speech generator, implemented as a deep learning inpainting system with a U-net architecture, and trained to output a reconstructed speech. The training set is composed of unimpaired proper speech examples, and the generator is trained to reconstruct the original proper speech. We evaluated the performance of our system on phoneme replacement of minimal pair words of English as well as on children with pronunciation disorders. Results suggest that human listeners slightly prefer our generated speech over a smoothed replacement of the inaccurate phoneme with a production of a different speaker.

翻译：学习新语言需要不断地将语音制作与来自环境的参考制作进行对比。在获取语音时, 儿童会做出与护理员的演讲相匹配的动脉调整。成长后学习一种语言的学习者将语言调整为与导师的参考匹配。本文提出了合成生成正确发音反馈的方法, 不正确制作了错误的发音反馈。此外, 我们的目标是在保持发言者原声的同时生成校正的制作。系统会促使用户发声。语音记录在记录中, 与不准确的电话相伴的样本用零遮盖。这个波形体可以作为语音生成器的输入器, 用 U- net 结构进行深度的修饰系统实施, 并培训其输出重塑的语音。训练组由无瑕疵的适当语音示例组成, 并训练生成者重建原有的正确语音。我们评估了我们的系统在电话机替换英语最小配音的功能以及有发音障碍的儿童的功能。结果显示, 人类听众会略地更喜欢我们生成的语音而不是用不同发言者制作的平滑的语音替换。

0

相关内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

74+阅读 · 2022年3月15日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

43+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

162+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

87+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

52+阅读 · 2020年1月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

55+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

146+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

171+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

77+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

91+阅读 · 2019年10月10日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

24+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

抗MRSA活性rhodomyrtosone B类似物的合成和构效关系研究

国家自然科学基金

0+阅读 · 2015年12月31日

细菌感染新型荧光共振能量转移探针的设计、合成及应用研究

国家自然科学基金

0+阅读 · 2015年12月31日

不同弹头基团的苯胺嘧啶类EGFR激酶抑制剂的设计、合成及抗NSCLC活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

靶向微管蛋白秋水仙碱位点的白藜芦醇-Combrestatin A-4类抑制剂的设计、合成及活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

不同途径移植HUCB-MSCs治疗脑血管病大鼠microPET-CT评价及其治疗机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

“开关型”荧光纳米探针用于活细胞内生物分子的检测

国家自然科学基金

0+阅读 · 2012年12月31日

供应链多级库存网络的RFID使能的Push/Pull混合控制策略的研究

国家自然科学基金

0+阅读 · 2012年12月31日

stTRAIL-MSC靶向促进肝癌RFA过渡区残癌细胞凋亡的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

思茅藤C21甾体抗骨质疏松活性研究

国家自然科学基金

0+阅读 · 2009年12月31日

肿瘤细胞EGFR靶向的双功能免疫纳米胶束用于肿瘤MRI检测及药物治疗的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Efficient Planning in a Compact Latent Action Space

Arxiv

0+阅读 · 2022年8月22日

SeNMFk-SPLIT: Large Corpora Topic Modeling by Semantic Non-negative Matrix Factorization with Automatic Model Selection

Arxiv

0+阅读 · 2022年8月21日

Learning Speaker-specific Lip-to-Speech Generation

Arxiv

0+阅读 · 2022年8月20日

A Domain Generalization Approach for Out-Of-Distribution 12-lead ECG Classification with Convolutional Neural Networks

Arxiv

0+阅读 · 2022年8月20日

Practitioners Perspective on Motivators of Agile in Global Software Development

Practitioners Perspective on Motivators of Agile in Global Software Development

Arxiv

0+阅读 · 2022年8月19日

A Risk-Sensitive Approach to Policy Optimization

Arxiv

0+阅读 · 2022年8月19日

Representation Learning for the Automatic Indexing of Sound Effects Libraries

Arxiv

0+阅读 · 2022年8月18日

Tree species classification from hyperspectral data using graph-regularized neural networks

Arxiv

0+阅读 · 2022年8月18日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Arxiv

11+阅读 · 2018年2月10日

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Arxiv

11+阅读 · 2017年12月27日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

74+阅读 · 2022年3月15日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

43+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

162+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

87+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

52+阅读 · 2020年1月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

55+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

146+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

171+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

77+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

91+阅读 · 2019年10月10日

热门VIP内容

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

24+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

16+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Efficient Planning in a Compact Latent Action Space

Arxiv

0+阅读 · 2022年8月22日

SeNMFk-SPLIT: Large Corpora Topic Modeling by Semantic Non-negative Matrix Factorization with Automatic Model Selection

Arxiv

0+阅读 · 2022年8月21日

Learning Speaker-specific Lip-to-Speech Generation

Arxiv

0+阅读 · 2022年8月20日

A Domain Generalization Approach for Out-Of-Distribution 12-lead ECG Classification with Convolutional Neural Networks

Arxiv

0+阅读 · 2022年8月20日

Practitioners Perspective on Motivators of Agile in Global Software Development

Practitioners Perspective on Motivators of Agile in Global Software Development

Arxiv

0+阅读 · 2022年8月19日

A Risk-Sensitive Approach to Policy Optimization

Arxiv

0+阅读 · 2022年8月19日

Representation Learning for the Automatic Indexing of Sound Effects Libraries

Arxiv

0+阅读 · 2022年8月18日

Tree species classification from hyperspectral data using graph-regularized neural networks

Arxiv

0+阅读 · 2022年8月18日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Arxiv

11+阅读 · 2018年2月10日

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Arxiv

11+阅读 · 2017年12月27日

相关基金

抗MRSA活性rhodomyrtosone B类似物的合成和构效关系研究

国家自然科学基金

0+阅读 · 2015年12月31日

细菌感染新型荧光共振能量转移探针的设计、合成及应用研究

国家自然科学基金

0+阅读 · 2015年12月31日

不同弹头基团的苯胺嘧啶类EGFR激酶抑制剂的设计、合成及抗NSCLC活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

靶向微管蛋白秋水仙碱位点的白藜芦醇-Combrestatin A-4类抑制剂的设计、合成及活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

不同途径移植HUCB-MSCs治疗脑血管病大鼠microPET-CT评价及其治疗机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

“开关型”荧光纳米探针用于活细胞内生物分子的检测

国家自然科学基金

0+阅读 · 2012年12月31日

供应链多级库存网络的RFID使能的Push/Pull混合控制策略的研究

国家自然科学基金

0+阅读 · 2012年12月31日

stTRAIL-MSC靶向促进肝癌RFA过渡区残癌细胞凋亡的实验研究

国家自然科学基金

0+阅读 · 2009年12月31日

思茅藤C21甾体抗骨质疏松活性研究

国家自然科学基金

0+阅读 · 2009年12月31日

肿瘤细胞EGFR靶向的双功能免疫纳米胶束用于肿瘤MRI检测及药物治疗的研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员