代号转换和代号混合语音识别 (Code Switched and Code Mixed Speech Recognition for Indic languages) - 专知论文

会员服务 ·

0

语音识别 · 代码 · INFORMS · Performer · MoDELS ·

2022 年 6 月 13 日

Code Switched and Code Mixed Speech Recognition for Indic languages

翻译：代号转换和代号混合语音识别

Harveen Singh Chadha,Priyanshi Shah,Ankur Dhuriya,Neeraj Chhimwal,Anirudh Gupta,Vivek Raghavan

from arxiv, This paper for submitted to Interspeech 2022

Training multilingual automatic speech recognition (ASR) systems is challenging because acoustic and lexical information is typically language specific. Training multilingual system for Indic languages is even more tougher due to lack of open source datasets and results on different approaches. We compare the performance of end to end multilingual speech recognition system to the performance of monolingual models conditioned on language identification (LID). The decoding information from a multilingual model is used for language identification and then combined with monolingual models to get an improvement of 50% WER across languages. We also propose a similar technique to solve the Code Switched problem and achieve a WER of 21.77 and 28.27 over Hindi-English and Bengali-English respectively. Our work talks on how transformer based ASR especially wav2vec 2.0 can be applied in developing multilingual ASR and code switched ASR for Indic languages.

翻译：多语言自动语音识别培训系统具有挑战性,因为语音和词汇信息通常是语言专用的。由于缺乏开放源数据集和不同方法的结果,印度语的多语言培训系统更加困难。我们把结束多语言语音识别系统的绩效与以语言识别为条件的单一语言模式的绩效进行比较。多语言模式的解码信息用于语言识别,然后与单一语言模式相结合,使不同语言的WER提高50%。我们还提议了一种类似技术,以解决《守则》转换的问题,并实现对印地语英语和孟加拉语英语的WER21.77和28.27。我们的工作讲座是如何在开发多语言的ASR特别是wav2vec 2.0中应用变压器,将ASR代码转换为印度语的代码。

0

相关内容

语音识别

语音识别是计算机科学和计算语言学的一个跨学科子领域，它发展了一些方法和技术，使计算机可以将口语识别和翻译成文本。它也被称为自动语音识别（ASR），计算机语音识别或语音转文本（STT）。它整合了计算机科学，语言学和计算机工程领域的知识和研究。

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【超赞的#C++#速查&信息图】“hacking c++ - Cheat Sheets & Infographics”

【超赞的#C++#速查&信息图】“hacking c++ - Cheat Sheets & Infographics”

专知会员服务

30+阅读 · 2022年3月8日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【论文】使用编码器进行命名实体识别（TENER: Adapting Transformer Encoder for Named Entity Recognition）

【论文】使用编码器进行命名实体识别（TENER: Adapting Transformer Encoder for Named Entity Recognition）

专知会员服务

52+阅读 · 2019年12月28日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

免标记的SERS微阵列芯片对病原菌的高通量快速检测

国家自然科学基金

0+阅读 · 2015年12月31日

构建HFMD-EV71神经损害MR特异性分子探针的实验研究

国家自然科学基金

0+阅读 · 2014年12月31日

磁性石墨烯-层状铁氧化物复合材料构建及对卤代有机物还原降解的性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

基因多态位点预测云南非小细胞肺癌预后风险研究

国家自然科学基金

1+阅读 · 2012年12月31日

柽柳Dof转录因子的耐盐调控机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于4G-OFDM体制的GEO卫星移动通信系统星载交换关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于感知失真度量的高效视频编码率失真优化研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于microRNA调控叶酸对转基因AD小鼠神经干细胞作用机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

Helmholtz散射问题的谱元法

国家自然科学基金

0+阅读 · 2011年12月31日

基于盲源分离的扩谱通信抗干扰方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition

Arxiv

0+阅读 · 2022年8月3日

Performance Disparities Between Accents in Automatic Speech Recognition

Arxiv

0+阅读 · 2022年8月1日

DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition

Arxiv

0+阅读 · 2022年8月1日

Efficient Long-Text Understanding with Short-Text Models

Arxiv

0+阅读 · 2022年8月1日

SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech

Arxiv

0+阅读 · 2022年7月29日

Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition

Arxiv

0+阅读 · 2022年7月29日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Pre-Training with Whole Word Masking for Chinese BERT

Arxiv

11+阅读 · 2019年6月19日

CAN-NER: Convolutional Attention Network forChinese Named Entity Recognition

Arxiv

16+阅读 · 2019年4月3日

VIP会员

文章信息

相关主题

相关VIP内容

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

【USC-Aaron Chan博士答辩Slides】可信自然语言处理机器解释的生成与利用, 242页ppt，Generating and Utilizing Machine Explanations for Trustworthy NLP

专知会员服务

16+阅读 · 2022年3月13日

【超赞的#C++#速查&信息图】“hacking c++ - Cheat Sheets & Infographics”

【超赞的#C++#速查&信息图】“hacking c++ - Cheat Sheets & Infographics”

专知会员服务

30+阅读 · 2022年3月8日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【论文】使用编码器进行命名实体识别（TENER: Adapting Transformer Encoder for Named Entity Recognition）

【论文】使用编码器进行命名实体识别（TENER: Adapting Transformer Encoder for Named Entity Recognition）

专知会员服务

52+阅读 · 2019年12月28日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

用于无人机的C波段空地通信系统研究 | 2025最新116页

甚高频军事战术通信系统传播性能分析研究

军事通信系统：安全行动的支柱

卫星与地面通信系统：美陆军面临的空间与电子战局势 | 39页报告

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition

Arxiv

0+阅读 · 2022年8月3日

Performance Disparities Between Accents in Automatic Speech Recognition

Arxiv

0+阅读 · 2022年8月1日

DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition

Arxiv

0+阅读 · 2022年8月1日

Efficient Long-Text Understanding with Short-Text Models

Arxiv

0+阅读 · 2022年8月1日

SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech

Arxiv

0+阅读 · 2022年7月29日

Pronunciation-aware unique character encoding for RNN Transducer-based Mandarin speech recognition

Arxiv

0+阅读 · 2022年7月29日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Pre-Training with Whole Word Masking for Chinese BERT

Arxiv

11+阅读 · 2019年6月19日

CAN-NER: Convolutional Attention Network forChinese Named Entity Recognition

Arxiv

16+阅读 · 2019年4月3日

相关基金

免标记的SERS微阵列芯片对病原菌的高通量快速检测

国家自然科学基金

0+阅读 · 2015年12月31日

构建HFMD-EV71神经损害MR特异性分子探针的实验研究

国家自然科学基金

0+阅读 · 2014年12月31日

磁性石墨烯-层状铁氧化物复合材料构建及对卤代有机物还原降解的性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

基因多态位点预测云南非小细胞肺癌预后风险研究

国家自然科学基金

1+阅读 · 2012年12月31日

柽柳Dof转录因子的耐盐调控机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于4G-OFDM体制的GEO卫星移动通信系统星载交换关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于感知失真度量的高效视频编码率失真优化研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于microRNA调控叶酸对转基因AD小鼠神经干细胞作用机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

Helmholtz散射问题的谱元法

国家自然科学基金

0+阅读 · 2011年12月31日

基于盲源分离的扩谱通信抗干扰方法研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员