Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems - 专知论文

Language modeling · Speech recognition · MoDELS · Lexical analysis · state-of-the-art

April 1, 2022

Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems

Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Nobuyasu Itoh, George Saon

From arXiv; submitted to Interspeech 2022

Large-scale language models (LLMs) such as GPT-2, BERT and RoBERTa have been successfully applied to ASR N-best rescoring. However, whether or how they can benefit competitive, near state-of-the-art ASR systems remains unexplored. In this study, we incorporate LLM rescoring into one of the most competitive ASR baselines: the Conformer-Transducer model. We demonstrate that consistent improvement is achieved by the LLM's bidirectionality, pretraining, in-domain finetuning and context augmentation. Furthermore, our lexical analysis sheds light on how each of these components may be contributing to the ASR performance.
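As a rough illustration of what N-best rescoring means, the sketch below re-ranks a list of ASR hypotheses by adding a weighted language model score to each ASR score. The abstract does not give the paper's scoring formula, so the log-linear combination, the `lm_weight` parameter, the `Hypothesis` type and the `toy_lm_score` function are all assumptions made for illustration; in practice the toy scorer would be replaced by a log-probability (or pseudo-log-likelihood) from a model such as GPT-2, BERT or RoBERTa.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str         # candidate transcript from the ASR N-best list
    asr_score: float  # ASR log-probability (higher is better)


def rescore_nbest(
    nbest: List[Hypothesis],
    lm_score: Callable[[str], float],
    lm_weight: float = 0.5,
) -> List[Hypothesis]:
    """Sort hypotheses by ASR score plus weighted LM score, best first.

    The log-linear combination is an assumed, commonly used form,
    not a formula taken from the paper.
    """
    def combined(h: Hypothesis) -> float:
        return h.asr_score + lm_weight * lm_score(h.text)

    return sorted(nbest, key=combined, reverse=True)


def toy_lm_score(text: str) -> float:
    # Hypothetical stand-in for an LLM log-probability:
    # each word outside a tiny vocabulary costs 2.0.
    known = {"recognize", "speech"}
    return sum(0.0 if word in known else -2.0 for word in text.split())


nbest = [
    Hypothesis("wreck a nice beach", -4.0),  # top ASR hypothesis
    Hypothesis("recognize speech", -4.5),
]
best = rescore_nbest(nbest, toy_lm_score, lm_weight=0.5)[0]
print(best.text)
```

With `lm_weight=0.0` the ranking falls back to the ASR scores alone, which makes the weight a convenient knob to tune on held-out data.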

