Diagonal State Space Augmented Transformers for Speech Recognition

We improve on the popular conformer architecture by replacing the depthwise temporal convolutions with diagonal state space (DSS) models. DSS is a recently introduced variant of linear RNNs obtained by discretizing a linear dynamical system with a diagonal state transition matrix. DSS layers project the input sequence onto a space of orthogonal polynomials, where the choice of basis functions, metric, and support is controlled by the eigenvalues of the transition matrix. We compare neural transducers with either conformer or our proposed DSS-augmented transformer (DSSformer) encoders on three public corpora: Switchboard English conversational telephone speech (300 hours), Switchboard+Fisher (2000 hours), and MALACH, a spoken archive of Holocaust survivor testimonials (176 hours). On Switchboard 300/2000 hours, we reach a single-model performance of 8.9%/6.7% WER, respectively, on the combined test set of the Hub5 2000 evaluation, and on MALACH we improve the WER by 7% relative over the previous best published result. In addition, we present empirical evidence suggesting that DSS layers learn damped Fourier basis functions in which the attenuation coefficients are layer-specific, whereas the frequency coefficients converge to almost identical, linearly spaced values across all layers.
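To make the "discretized linear dynamical system with a diagonal transition matrix" concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the zero-order-hold discretization, the function names (`diagonal_ssm_kernel`, `apply_ssm`, `recurrent_ssm`), and the example eigenvalues are all assumptions made for illustration. It shows that a diagonal state space layer can be evaluated either as a linear recurrence or as a causal convolution with a kernel built from powers of the discretized eigenvalues. With eigenvalues of the form -a + iω, each kernel component is a damped complex exponential, which is the sense in which such layers can represent damped Fourier basis functions.

```python
import numpy as np

def diagonal_ssm_kernel(lam, b, c, step, length):
    """Convolution kernel of a diagonal state space model (illustrative).

    Assumed continuous system: x'(t) = diag(lam) x(t) + b u(t), y(t) = Re(c . x(t)),
    discretized with a zero-order hold of step size `step`.
    """
    lam_bar = np.exp(lam * step)              # discrete diagonal transition
    b_bar = (lam_bar - 1.0) / lam * b         # discrete input vector (lam != 0)
    k = np.arange(length)
    powers = lam_bar[:, None] ** k[None, :]   # shape (state_dim, length)
    # K[k] = Re( sum_n c_n * b_bar_n * lam_bar_n**k )
    return ((c * b_bar)[:, None] * powers).sum(axis=0).real

def apply_ssm(u, kernel):
    """Causal convolution of a real input sequence with the kernel."""
    return np.convolve(u, kernel)[: len(u)]

def recurrent_ssm(u, lam, b, c, step):
    """Same model evaluated step by step as a linear RNN."""
    lam_bar = np.exp(lam * step)
    b_bar = (lam_bar - 1.0) / lam * b
    x = np.zeros_like(lam, dtype=complex)
    ys = []
    for u_t in u:
        x = lam_bar * x + b_bar * u_t         # elementwise: transition is diagonal
        ys.append((c * x).sum().real)
    return np.array(ys)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, t = 8, 64
    # Hypothetical eigenvalues: shared damping, linearly spaced frequencies.
    lam = -0.05 + 1j * np.pi * np.arange(1, n + 1)
    b = np.ones(n, dtype=complex)
    c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    u = rng.standard_normal(t)
    kernel = diagonal_ssm_kernel(lam, b, c, step=0.1, length=t)
    print(np.allclose(apply_ssm(u, kernel), recurrent_ssm(u, lam, b, c, step=0.1)))
```

Because the transition matrix is diagonal, the recurrence is an elementwise multiply and the kernel needs only powers of each eigenvalue, so no matrix exponentials or matrix powers are required.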


