While automatic speech recognition performance has improved over the last several years, machines continue to perform significantly worse on accented speech than humans do. In addition, the most significant improvements on accented speech arise primarily from overwhelming the problem with hundreds or even thousands of hours of data; humans typically require far less data to adapt to a new accent. This paper explores methods inspired by human perception to evaluate possible performance improvements for recognition of accented speech, with a specific focus on recognizing speech with a novel accent relative to that of the training data. Our experiments are run on small datasets that are readily accessible to the research community. We explore four methodologies: pre-exposure to multiple accents, grapheme- and phoneme-based pronunciations, dropout (to improve generalization to a novel accent), and the identification of the layers in the neural network that can specifically be associated with accent modeling. Our results indicate that methods based on human perception are promising for reducing WER and for understanding how accented speech is modeled in neural networks for novel accents.
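One of the four methodologies above, dropout, is a standard regularization technique: during training, activations are randomly zeroed so the network cannot over-rely on any single unit, which can improve generalization to unseen conditions such as a novel accent. The sketch below is a minimal, illustrative implementation of inverted dropout in NumPy (the function name, rate, and shapes are assumptions for illustration, not the paper's actual configuration):

```python
import numpy as np

def dropout(x, rate=0.5, train=True, rng=None):
    """Inverted dropout: zero each activation with probability `rate`
    during training, and rescale survivors by 1/(1 - rate) so the
    expected activation magnitude is unchanged at inference time."""
    if not train or rate == 0.0:
        return x  # dropout is disabled at inference
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= rate  # True = keep the unit
    return x * mask / (1.0 - rate)

# Example: a 4x8 block of activations with 25% of units dropped.
acts = np.ones((4, 8))
out = dropout(acts, rate=0.25)
```

Surviving units take the value 1/(1 - 0.25) ≈ 1.333 and dropped units become 0, so the expected value of each activation remains 1; at inference (`train=False`) the input passes through unchanged.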