UML: A Universal Monolingual Output Layer for Multilingual ASR

Word-piece models (WPMs) are commonly used subword units in state-of-the-art end-to-end automatic speech recognition (ASR) systems. For multilingual ASR, because written scripts differ across languages, multilingual WPMs lead to overly large output layers and make it hard to scale to more languages. In this work, we propose a universal monolingual output layer (UML) to address these problems. Instead of tying each output node to a single WPM, UML re-associates each output node with multiple WPMs, one for each language, which results in a smaller monolingual output layer shared across languages. Consequently, UML makes it possible to switch the interpretation of each output node depending on the language of the input speech. Experimental results on an 11-language voice search task demonstrated the feasibility of using UML for high-quality and high-efficiency multilingual streaming ASR.
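The node re-association described in the abstract can be illustrated with a small sketch. This is only a toy illustration under stated assumptions, not the authors' implementation: the table `NODE_TO_WPM`, the three tiny vocabularies, and the helper names `union_output_size`, `uml_output_size`, and `interpret` are all hypothetical, and the abstract does not say how WPMs are actually assigned to nodes.

```python
from typing import Dict, List

# Hypothetical toy vocabularies (assumption, not from the paper).
# Index i in each list is the word piece that shared output node i
# stands for when the input speech is in that language.
NODE_TO_WPM: Dict[str, List[str]] = {
    "en": ["<blank>", "_the", "_cat", "s"],
    "el": ["<blank>", "_η", "_γάτα", "ς"],
    "ru": ["<blank>", "_кот", "_и", "ы"],
}


def union_output_size(tables: Dict[str, List[str]]) -> int:
    """Output size when every distinct WPM across languages gets its own node."""
    return len({wpm for table in tables.values() for wpm in table})


def uml_output_size(tables: Dict[str, List[str]]) -> int:
    """Output size when nodes are shared: the largest per-language table."""
    return max(len(table) for table in tables.values())


def interpret(node_ids: List[int], language: str,
              tables: Dict[str, List[str]] = NODE_TO_WPM) -> List[str]:
    """Map shared node indices to word pieces using the given language's table."""
    table = tables[language]
    return [table[i] for i in node_ids]


if __name__ == "__main__":
    print("union layer size:", union_output_size(NODE_TO_WPM))
    print("shared layer size:", uml_output_size(NODE_TO_WPM))
    # The same node indices read differently depending on the language.
    for lang in NODE_TO_WPM:
        print(lang, interpret([1, 2, 3], lang))
```

In this toy setup the shared layer needs only as many nodes as the largest single-language vocabulary, while a layer built over the union of all word pieces grows with every added script. The sketch assumes the language of the input is known when the node indices are interpreted.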
