This paper presents InterMPL, a semi-supervised learning method for end-to-end automatic speech recognition (ASR) that performs pseudo-labeling (PL) with intermediate supervision. Momentum PL (MPL) trains a connectionist temporal classification (CTC)-based model on unlabeled data by continuously generating pseudo-labels on the fly and improving their quality. In contrast to autoregressive formulations, such as the attention-based encoder-decoder and the transducer, CTC is well suited for MPL, and for PL-based semi-supervised ASR in general, owing to its simple and fast inference algorithm and its robustness against generating collapsed labels. However, CTC generally performs worse than autoregressive models due to its conditional independence assumption, which in turn limits the performance of MPL. We propose to enhance MPL by introducing intermediate losses, inspired by recent advances in CTC-based modeling. Specifically, we focus on self-conditioned and hierarchical conditional CTC, which apply auxiliary CTC losses to intermediate layers such that the conditional independence assumption is explicitly relaxed. We also explore how pseudo-labels should be generated and used as supervision for the intermediate losses. Experimental results in different semi-supervised settings demonstrate that the proposed approach outperforms MPL and improves an ASR model by up to 12.1% absolute. In addition, our detailed analysis validates the importance of the intermediate losses.
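For reference, a minimal sketch of how the two mechanisms named above are commonly formulated is given below. The interpolation weight $\lambda$, the set of intermediate layers $\mathcal{K}$, and the momentum coefficient $\alpha$ are illustrative symbols following the standard intermediate-CTC and MPL formulations, not values specified in this abstract:

\[
\mathcal{L} \;=\; (1 - \lambda)\,\mathcal{L}_{\mathrm{CTC}}\big(Y \mid X^{(L)}\big) \;+\; \frac{\lambda}{|\mathcal{K}|} \sum_{k \in \mathcal{K}} \mathcal{L}_{\mathrm{CTC}}\big(Y \mid X^{(k)}\big),
\qquad
\bar{\theta} \;\leftarrow\; \alpha\,\bar{\theta} + (1 - \alpha)\,\theta,
\]

where $X^{(k)}$ is the encoder representation after layer $k$ (with $L$ the final layer), $Y$ is the (pseudo-)label sequence, $\theta$ denotes the online model being trained, and $\bar{\theta}$ the offline model that generates pseudo-labels on the fly. In self-conditioned CTC, the intermediate posterior computed at each layer in $\mathcal{K}$ is additionally fed back into the subsequent layer, which is what explicitly relaxes the conditional independence assumption.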