Speaker extraction aims to extract the target speaker's voice from a multi-talker speech mixture given an auxiliary reference utterance. Recent studies show that speaker extraction benefits from knowing the location or direction of the target speaker. However, these studies assume that the target speaker's location is known in advance or detected from an extra visual cue, e.g., a face image or video. In this paper, we propose an end-to-end localized target speaker extraction framework based purely on speech cues, named L-SpEx. Specifically, we design a speaker localizer, driven by the target speaker's embedding, to extract spatial features, including the direction-of-arrival (DOA) of the target speaker and the beamforming output. The spatial cues and the target speaker's embedding are then jointly used to form top-down auditory attention to the target speaker. Experiments on the multi-channel reverberant dataset MC-Libri2Mix show that our L-SpEx approach significantly outperforms the baseline system.
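To make the spatial cues concrete, the sketch below illustrates the kind of feature the abstract attributes to the speaker localizer: scanning candidate DOAs with a delay-and-sum beamformer and picking the direction of maximum output power. This is a generic toy example with a simulated two-microphone array and white-noise source, not the paper's actual localizer; all constants (mic spacing, sample rate) are assumptions for illustration.

```python
# Toy sketch (not the L-SpEx implementation): delay-and-sum beamforming
# over candidate DOAs, the kind of spatial cue the speaker localizer
# is described as producing.
import numpy as np

C = 343.0    # speed of sound (m/s)
FS = 16000   # sample rate (Hz), assumed
D = 0.05     # mic spacing (m), assumed 2-mic linear array

def frac_delay(x, tau):
    """Delay signal x by tau seconds (fractional, via FFT phase shift)."""
    f = np.fft.fftfreq(len(x), d=1.0 / FS)
    return np.real(np.fft.ifft(np.fft.fft(x) * np.exp(-2j * np.pi * f * tau)))

def doa_to_tau(deg):
    """Inter-mic delay for a far-field source arriving at `deg` degrees."""
    return D * np.sin(np.deg2rad(deg)) / C

def das_power(x1, x2, deg):
    """Output power of a delay-and-sum beamformer steered to `deg`."""
    y = x1 + frac_delay(x2, -doa_to_tau(deg))  # advance mic 2 to realign
    return float(np.mean(y ** 2))

# Simulate a broadband source arriving from 60 degrees.
rng = np.random.default_rng(0)
src = rng.standard_normal(FS)
x1, x2 = src, frac_delay(src, doa_to_tau(60.0))

# Scan candidate DOAs; the power peak indicates the source direction,
# and the beamformer output at that angle is the enhanced signal.
angles = np.arange(-90, 91)
powers = [das_power(x1, x2, a) for a in angles]
est_doa = int(angles[int(np.argmax(powers))])
```

In L-SpEx the scan is driven by the target speaker's embedding so that, with multiple talkers present, the localizer resolves the target's direction rather than simply the loudest source.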