扩大基于区域NNN-T的语音识别系统,并进行情感和语言分类 (Extending RNN-T-based speech recognition systems with emotion and language classification) - 专知论文

会员服务 ·

0

语音识别 · state-of-the-art · 循环神经网络 · 模型评估 · Neural Networks ·

2022 年 7 月 28 日

Extending RNN-T-based speech recognition systems with emotion and language classification

翻译：扩大基于区域NNN-T的语音识别系统,并进行情感和语言分类

Zvi Kons,Hagai Aronowitz,Edmilson Morais,Matheus Damasceno,Hong-Kwang Kuo,Samuel Thomas,George Saon

from arxiv, Accepted for publication in Interspeech 2022

Speech transcription, emotion recognition, and language identification are usually considered to be three different tasks. Each one requires a different model with a different architecture and training process. We propose using a recurrent neural network transducer (RNN-T)-based speech-to-text (STT) system as a common component that can be used for emotion recognition and language identification as well as for speech recognition. Our work extends the STT system for emotion classification through minimal changes, and shows successful results on the IEMOCAP and MELD datasets. In addition, we demonstrate that by adding a lightweight component to the RNN-T module, it can also be used for language identification. In our evaluations, this new classifier demonstrates state-of-the-art accuracy for the NIST-LRE-07 dataset.

翻译：语音转录、情感识别和语言识别通常被视为三种不同的任务。每一种任务都需要不同的模式,有不同的架构和培训过程。我们提议使用一个基于神经网络的经常性语音对文本传输器(RNN-T)系统,作为可用于情感识别和语言识别以及语音识别的一个共同组成部分。我们的工作通过微小的修改扩展了STT情感分类系统,并展示了IMOC和MELD数据集的成功结果。此外,我们还证明,通过在RNN-T模块中添加一个轻量级组件,它也可以用于语言识别。在我们的评价中,这个新的分类器展示了NIST-LRE-07数据集的最新准确性。

0

相关内容

语音识别

语音识别是计算机科学和计算语言学的一个跨学科子领域，它发展了一些方法和技术，使计算机可以将口语识别和翻译成文本。它也被称为自动语音识别（ASR），计算机语音识别或语音转文本（STT）。它整合了计算机科学，语言学和计算机工程领域的知识和研究。

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

多维在线跨语言Calling Network建模及其在可信国家电子税务软件中的实证应用

国家自然科学基金

0+阅读 · 2014年12月31日

领域驱动空间co-location模式挖掘技术研究

国家自然科学基金

0+阅读 · 2014年12月31日

藤黄酸抗B细胞非霍奇金淋巴瘤新机制- - 调控SRC-3/组蛋白乙酰化转录复合物SUMO化修饰

国家自然科学基金

0+阅读 · 2012年12月31日

调谐输出式硅MEMS陀螺确定性误差机理、建模及标定补偿方法研究

国家自然科学基金

1+阅读 · 2009年12月31日

基于Sparse-Land模型的SAR图像噪声抑制与分割

国家自然科学基金

0+阅读 · 2009年12月31日

EEG-based Image Feature Extraction for Visual Classification using Deep Learning

Arxiv

0+阅读 · 2022年9月27日

Improving Image Clustering through Sample Ranking and Its Application to remote--sensing images

Arxiv

0+阅读 · 2022年9月26日

Mental arithmetic task classification with convolutional neural network based on spectral-temporal features from EEG

Arxiv

0+阅读 · 2022年9月26日

Backend Ensemble for Speaker Verification and Spoofing Countermeasure

Arxiv

0+阅读 · 2022年9月23日

Multi-Label Text Classification using Attention-based Graph Neural Network

Arxiv

46+阅读 · 2020年3月22日

VIP会员

文章信息

相关主题

state-of-the-art

循环神经网络

Neural Networks

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

美海军作战管理系统：变革战场空间的二十年

《任务与武器驱动美海军舰队设计》报告

俄罗斯“沙希德”/“天竺葵”攻击无人机

《利用动态图对网络攻击进行建模与仿真：在云安全评估中的应用》90页

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

相关论文

EEG-based Image Feature Extraction for Visual Classification using Deep Learning

Arxiv

0+阅读 · 2022年9月27日

Improving Image Clustering through Sample Ranking and Its Application to remote--sensing images

Arxiv

0+阅读 · 2022年9月26日

Mental arithmetic task classification with convolutional neural network based on spectral-temporal features from EEG

Arxiv

0+阅读 · 2022年9月26日

Backend Ensemble for Speaker Verification and Spoofing Countermeasure

Arxiv

0+阅读 · 2022年9月23日

Multi-Label Text Classification using Attention-based Graph Neural Network

Arxiv

46+阅读 · 2020年3月22日

相关基金

多维在线跨语言Calling Network建模及其在可信国家电子税务软件中的实证应用

国家自然科学基金

0+阅读 · 2014年12月31日

领域驱动空间co-location模式挖掘技术研究

国家自然科学基金

0+阅读 · 2014年12月31日

藤黄酸抗B细胞非霍奇金淋巴瘤新机制- - 调控SRC-3/组蛋白乙酰化转录复合物SUMO化修饰

国家自然科学基金

0+阅读 · 2012年12月31日

调谐输出式硅MEMS陀螺确定性误差机理、建模及标定补偿方法研究

国家自然科学基金

1+阅读 · 2009年12月31日

基于Sparse-Land模型的SAR图像噪声抑制与分割

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员