Investigation of Data Augmentation Techniques for Disordered Speech Recognition - 专知 (Zhuanzhi) Papers

Data augmentation · Speech recognition · Hidden units · Scenario · Error rate

January 14, 2022

Investigation of Data Augmentation Techniques for Disordered Speech Recognition

Mengzhe Geng, Xurong Xie, Shansong Liu, Jianwei Yu, Shoukang Hu, Xunying Liu, Helen Meng

From arXiv, Proceedings of INTERSPEECH 2020

Disordered speech recognition is a highly challenging task. The underlying neuro-motor conditions of people with speech disorders, often compounded with co-occurring physical disabilities, lead to the difficulty in collecting large quantities of speech required for system development. This paper investigates a set of data augmentation techniques for disordered speech recognition, including vocal tract length perturbation (VTLP), tempo perturbation and speed perturbation. Both normal and disordered speech were exploited in the augmentation process. Variability among impaired speakers in both the original and augmented data was modeled using learning hidden unit contributions (LHUC) based speaker adaptive training. The final speaker adapted system constructed using the UASpeech corpus and the best augmentation approach based on speed perturbation produced up to 2.92% absolute (9.3% relative) word error rate (WER) reduction over the baseline system without data augmentation, and gave an overall WER of 26.37% on the test set containing 16 dysarthric speakers.
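To make the speed perturbation mentioned in the abstract concrete, the following is a minimal sketch, assuming a waveform held as a plain list of samples. It resamples the signal by a factor and keeps the sample rate fixed, so both the duration and the pitch change (tempo perturbation, by contrast, changes duration while preserving pitch). The function names `speed_perturb` and `augment`, the linear-interpolation resampler, the factors 0.9, 1.0 and 1.1, and the 16 kHz rate are all illustrative assumptions; the abstract does not state the authors' settings, and production toolkits normally use a band-limited resampler instead.

```python
import math


def speed_perturb(samples, factor):
    """Resample `samples` so that playback is `factor` times faster.

    factor > 1 shortens the signal (faster, higher pitch);
    factor < 1 lengthens it (slower, lower pitch).
    Linear interpolation is used here only for illustration.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) / factor)))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        # Position of output sample i on the original time axis,
        # clamped so we never read past the final input sample.
        pos = min(i * factor, last)
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def augment(samples, factors=(0.9, 1.0, 1.1)):
    """Return one perturbed copy of the utterance per factor (hypothetical helper)."""
    return {f: speed_perturb(samples, f) for f in factors}


if __name__ == "__main__":
    sr = 16000  # assumed sample rate, for illustration only
    # One second of a 440 Hz tone standing in for an utterance.
    tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
    for f, y in augment(tone).items():
        print(f"factor={f}: {len(y) / sr:.3f} s")
```

Each perturbed copy would be added to the training set alongside the original utterance, which is how this kind of augmentation enlarges a small corpus.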

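Related content

Data augmentation

In machine learning, data augmentation usually refers to methods (such as data distillation and balancing positive and negative samples) that improve the quality of a model's dataset and augment the data.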
