We present a Conformer-based end-to-end neural diarization (EEND) model that uses both acoustic input and features derived from an automatic speech recognition (ASR) model. Two categories of features are explored: features derived directly from ASR output (phones, position-in-word, and word boundaries) and features derived from a lexical speaker change detection model, trained by fine-tuning a pretrained BERT model on the ASR output. Three modifications to the Conformer-based EEND architecture are proposed to incorporate the features. First, ASR features are concatenated with acoustic features. Second, we propose a new attention mechanism, contextualized self-attention, that utilizes ASR features to build robust speaker representations. Finally, multi-task learning is used to train the model to minimize a classification loss for the ASR features along with the diarization loss. Experiments on the two-speaker English conversations of the Switchboard+SRE data sets show that multi-task learning with position-in-word information is the most effective way of utilizing ASR features, reducing the diarization error rate (DER) by 20% relative to the baseline.
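The first modification, concatenating frame-level ASR-derived features with acoustic features before the encoder, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature dimensions, the phone inventory size, and the use of one-hot phone labels are all assumptions made for the example.

```python
import numpy as np

# Hypothetical dimensions for illustration only.
T = 100          # number of frames
D_ACOUSTIC = 80  # e.g. log-mel filterbank dimension
N_PHONES = 40    # assumed phone inventory size

# Acoustic features: one D_ACOUSTIC-dim vector per frame.
acoustic = np.random.randn(T, D_ACOUSTIC).astype(np.float32)

# Frame-aligned ASR-derived labels (here, phone IDs), encoded one-hot.
phone_ids = np.random.randint(0, N_PHONES, size=T)
phone_onehot = np.eye(N_PHONES, dtype=np.float32)[phone_ids]

# Concatenate along the feature axis; the combined frames would then
# feed the Conformer-based EEND encoder.
encoder_input = np.concatenate([acoustic, phone_onehot], axis=-1)
print(encoder_input.shape)  # (100, 120)
```

The same pattern would apply to the other ASR-derived features (position-in-word, word boundaries), each contributing additional columns to the per-frame input.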