The Cone of Silence: Speech Separation by Localization

October 12, 2020


Teerapat Jenrungrot,Vivek Jayaram,Steve Seitz,Ira Kemelmacher-Shlizerman

From arXiv: 9 pages + references + supplementary. Oral presentation at NeurIPS 2020.

Given a multi-microphone recording of an unknown number of speakers talking concurrently, we simultaneously localize the sources and separate the individual speakers. At the core of our method is a deep network, in the waveform domain, which isolates sources within an angular region $\theta \pm w/2$, given an angle of interest $\theta$ and angular window size $w$. By exponentially decreasing $w$, we can perform a binary search to localize and separate all sources in logarithmic time. Our algorithm allows for an arbitrary number of potentially moving speakers at test time, including more speakers than seen during training. Experiments demonstrate state-of-the-art performance for both source separation and source localization, particularly in high levels of background noise.
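
The coarse-to-fine search described in the abstract can be illustrated with a minimal sketch. It covers only the localization part, and several pieces are assumptions rather than the authors' implementation: the deep network is replaced by a hypothetical toy oracle (`make_toy_cone_energy`) that counts known source angles inside $\theta \pm w/2$, and the initial window (90 degrees), stopping width (`min_window`), and energy `threshold` are illustrative values. Windows that contain no energy are discarded, and each remaining window is split into two halves, so $w$ shrinks by a factor of two per step.

```python
from typing import Callable, List, Tuple


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def make_toy_cone_energy(source_angles: List[float]) -> Callable[[float, float], float]:
    """Hypothetical stand-in for the separation network.

    Returns a function that counts how many sources fall inside the
    angular region theta +/- w/2. A real system would instead run the
    network and measure the energy of its output waveform.
    """
    def cone_energy(theta: float, w: float) -> float:
        return float(sum(angular_distance(theta, s) <= w / 2 for s in source_angles))
    return cone_energy


def localize(
    cone_energy: Callable[[float, float], float],
    initial_window: float = 90.0,   # assumed starting window size (degrees)
    min_window: float = 2.0,        # assumed stopping resolution (degrees)
    threshold: float = 0.5,         # assumed energy threshold
) -> Tuple[List[float], float]:
    """Binary search over angle: keep active windows, halve w, repeat."""
    w = initial_window
    # Tile the full circle with non-overlapping windows of width w.
    candidates = [i * w + w / 2 for i in range(int(round(360.0 / w)))]
    while True:
        # Keep only the windows in which the oracle reports a source.
        active = [t for t in candidates if cone_energy(t, w) > threshold]
        if w <= min_window:
            return sorted(t % 360.0 for t in active), w
        # Halve the window; each active window yields two child windows
        # centered at t - w/2 and t + w/2 (w is the new, halved width).
        w /= 2
        candidates = [t + sign * w / 2 for t in active for sign in (-1, 1)]


# Usage: two toy sources; each estimate lands within w/2 of a true angle.
angles, final_w = localize(make_toy_cone_energy([33.0, 200.0]))
print(angles, final_w)
```

With a fixed number of sources, each halving step queries at most two windows per source, so the number of queries grows with the logarithm of the angular resolution rather than linearly. A source lying exactly on a window boundary would be reported by both neighboring windows in this toy version; handling such duplicates is left out of the sketch.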

