Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models

We present Mu$^{2}$SLAM, a multilingual sequence-to-sequence model pre-trained jointly on unlabeled speech, unlabeled text and supervised data spanning Automatic Speech Recognition (ASR), Automatic Speech Translation (AST) and Machine Translation (MT), in over 100 languages. By leveraging a quantized representation of speech as a target, Mu$^{2}$SLAM trains the speech-text models with a sequence-to-sequence masked denoising objective similar to T5 on the decoder and a masked language modeling (MLM) objective on the encoder, for both unlabeled speech and text, while utilizing the supervised tasks to improve cross-lingual and cross-modal representation alignment within the model. On CoVoST AST, Mu$^{2}$SLAM establishes a new state-of-the-art for models trained on public datasets, improving on xx-en translation over the previous best by 1.9 BLEU points and on en-xx translation by 1.1 BLEU points. On Voxpopuli ASR, our model matches the performance of an mSLAM model fine-tuned with an RNN-T decoder, despite using a relatively weaker sequence-to-sequence architecture. On text understanding tasks, our model improves by more than 6\% over mSLAM on XNLI, getting closer to the performance of mT5 models of comparable capacity on XNLI and TydiQA, paving the way towards a single model for all speech and text understanding tasks.
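The abstract describes the decoder objective only as a masked denoising objective "similar to T5". As a rough illustration of that family of objectives, and not the authors' implementation, the sketch below applies T5-style span corruption to a generic sequence of discrete tokens, which could stand in for text subwords or quantized speech units. The function name `span_corrupt`, the hand-picked spans, and the toy tokens are all hypothetical; the `<extra_id_N>` sentinel naming follows the common T5 convention and is an assumption here, since the abstract does not specify how Mu$^{2}$SLAM selects or marks masked spans.

```python
from typing import List, Sequence, Tuple


def span_corrupt(
    tokens: Sequence[str],
    spans: Sequence[Tuple[int, int]],
) -> Tuple[List[str], List[str]]:
    """T5-style span corruption (illustrative sketch only).

    Each half-open span [start, end) is replaced by one sentinel in the
    encoder input. The decoder target lists each sentinel followed by the
    tokens it replaced, and ends with one closing sentinel.
    Spans are assumed to be sorted and non-overlapping.
    """
    encoder_input: List[str] = []
    decoder_target: List[str] = []
    cursor = 0
    for sentinel_id, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{sentinel_id}>"  # assumed T5-like naming
        # Copy the unmasked tokens before this span, then mark the gap.
        encoder_input.extend(tokens[cursor:start])
        encoder_input.append(sentinel)
        # The decoder has to reconstruct the masked tokens.
        decoder_target.append(sentinel)
        decoder_target.extend(tokens[start:end])
        cursor = end
    # Copy whatever follows the last masked span.
    encoder_input.extend(tokens[cursor:])
    # Closing sentinel marks the end of the target sequence.
    decoder_target.append(f"<extra_id_{len(spans)}>")
    return encoder_input, decoder_target


if __name__ == "__main__":
    toy_tokens = ["a", "b", "c", "d", "e", "f"]  # stand-in for text or speech units
    enc, dec = span_corrupt(toy_tokens, [(1, 3), (4, 5)])
    print(enc)
    print(dec)
```

In this toy setup the same routine works for either modality because it only sees discrete tokens, which is the role the abstract gives to the quantized speech representation. The encoder-side MLM objective the abstract also mentions is not shown.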

