Code-switching (CS) remains a challenge for Automatic Speech Recognition (ASR), especially for character-based models. Because characters from multiple languages are jointly available as outputs, character-based models suffer from phoneme duplication, which results in language-inconsistent spellings. We propose a Contextualized Connectionist Temporal Classification (CCTC) loss that encourages spelling consistency in character-based non-autoregressive ASR, which allows for faster inference. The CCTC loss conditions the main prediction on predicted contexts to ensure language-consistent spellings. In contrast to existing CTC-based approaches, the CCTC loss does not require frame-level alignments, since the context ground truth is obtained from the model's estimated path. Compared to the same model trained with the regular CTC loss, our method consistently improves ASR performance on both CS and monolingual corpora.
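The idea can be illustrated with a minimal PyTorch sketch: alongside the standard CTC loss on the main head, auxiliary context heads are trained with cross-entropy against targets derived from the model's own greedy best path (here simply the labels at the neighboring frames). All names, shapes, and the weighting factor `alpha` are assumptions for illustration; the paper's actual context-target construction (e.g., handling of blanks and repeats along the collapsed path) may differ.

```python
import torch
import torch.nn.functional as F

# Toy dimensions: T time steps, B batch size, C output characters (blank = 0)
T, B, C = 50, 2, 30
log_probs = torch.randn(T, B, C).log_softmax(-1)   # main prediction head
left_logits = torch.randn(T, B, C)                  # context head: previous label
right_logits = torch.randn(T, B, C)                 # context head: next label

targets = torch.randint(1, C, (B, 12))              # random toy transcripts
input_lengths = torch.full((B,), T)
target_lengths = torch.full((B,), 12)

# Standard CTC loss on the main head (no frame-level alignment needed)
ctc = F.ctc_loss(log_probs, targets, input_lengths, target_lengths, blank=0)

# Context ground truth from the model's own estimated (greedy best) path:
best_path = log_probs.argmax(-1)                    # (T, B)
left_tgt = torch.roll(best_path, 1, dims=0)         # label at frame t-1
right_tgt = torch.roll(best_path, -1, dims=0)       # label at frame t+1
left_tgt[0] = 0                                     # pad sequence edges with blank
right_tgt[-1] = 0

ce_left = F.cross_entropy(left_logits.reshape(-1, C), left_tgt.reshape(-1))
ce_right = F.cross_entropy(right_logits.reshape(-1, C), right_tgt.reshape(-1))

alpha = 0.1  # assumed context-loss weight, not specified in the abstract
cctc = ctc + alpha * (ce_left + ce_right)
```

Because the context targets come from the estimated path rather than a forced alignment, the auxiliary terms can be computed in the same forward pass as the CTC loss itself.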