与最佳变异培训的多讲者单一频道讲话隔离 (Many-Speakers Single Channel Speech Separation with Optimal Permutation Training) - 专知论文

会员服务 ·

0

分离的 · 置换不变性 · 优化器 · 通道 · 不变 ·

2021 年 4 月 18 日

Many-Speakers Single Channel Speech Separation with Optimal Permutation Training

翻译：与最佳变异培训的多讲者单一频道讲话隔离

Shaked Dovrat,Eliya Nachmani,Lior Wolf

Single channel speech separation has experienced great progress in the last few years. However, training neural speech separation for a large number of speakers (e.g., more than 10 speakers) is out of reach for the current methods, which rely on the Permutation Invariant Loss (PIT). In this work, we present a permutation invariant training that employs the Hungarian algorithm in order to train with an $O(C^3)$ time complexity, where $C$ is the number of speakers, in comparison to $O(C!)$ of PIT based methods. Furthermore, we present a modified architecture that can handle the increased number of speakers. Our approach separates up to $20$ speakers and improves the previous results for large $C$ by a wide margin.

翻译：过去几年来,单一频道的语音分离取得了巨大进展,然而,对大量发言者(例如10多个发言者)进行神经语音分离培训,目前的方法无法采用,这些方法依赖于变异性变异性损失(PIT),在这项工作中,我们提供了一种变异性培训,采用匈牙利算法,以便用3美元的时间复杂度来培训,与基于PIT方法的1美元(C)相比,其发言者数为1美元。此外,我们提出了一个经修改的结构,可以处理增加的发言者人数。我们的方法将发言者分为最多20美元,并大幅度地改进以前大额C$的成绩。

0

相关内容

分离的

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【ACMMM2020】小规模行人检测的自模拟学习

【ACMMM2020】小规模行人检测的自模拟学习

专知会员服务

15+阅读 · 2020年9月25日

斯坦福2020硬课《分布式算法与优化》

斯坦福2020硬课《分布式算法与优化》

专知会员服务

123+阅读 · 2020年5月6日

【Google Research】Wavesplit:通过说话者聚类实现端到端的语音分离，Wavesplit: End-to-End Speech Separation by Speaker Clustering

【Google Research】Wavesplit:通过说话者聚类实现端到端的语音分离，Wavesplit: End-to-End Speech Separation by Speaker Clustering

专知会员服务

19+阅读 · 2020年2月26日

【北邮-腾讯AI】自监督学习音视觉说话人认证，Self-supervised learning for audio-visual speaker diarization

【北邮-腾讯AI】自监督学习音视觉说话人认证，Self-supervised learning for audio-visual speaker diarization

专知会员服务

26+阅读 · 2020年2月16日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

【目标检测 | 2019最新综述】目标检测的最新进展，附40页PDF，Recent Advances in Deep Learning for Object Detection

【目标检测 | 2019最新综述】目标检测的最新进展，附40页PDF，Recent Advances in Deep Learning for Object Detection

专知会员服务

85+阅读 · 2019年11月15日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

模式国重实验室21篇论文入选CVPR 2020

模式国重实验室21篇论文入选CVPR 2020

专知

30+阅读 · 2020年3月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

详解GAN的谱归一化（Spectral Normalization）

详解GAN的谱归一化（Spectral Normalization）

PaperWeekly

11+阅读 · 2019年2月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

EMA2S: An End-to-End Multimodal Articulatory-to-Speech System

Arxiv

0+阅读 · 2021年6月9日

Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training

Arxiv

0+阅读 · 2021年6月8日

Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation

Arxiv

0+阅读 · 2021年6月8日

Speedy Performance Estimation for Neural Architecture Search

Arxiv

0+阅读 · 2021年6月8日

Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

Arxiv

0+阅读 · 2021年6月6日

Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication

Arxiv

0+阅读 · 2021年6月5日

Orthogonal Over-Parameterized Training

Arxiv

0+阅读 · 2021年6月5日

End-to-End Multi-speaker Speech Recognition with Transformer

Arxiv

8+阅读 · 2020年2月13日

Sample Efficient Adaptive Text-to-Speech

Arxiv

5+阅读 · 2019年1月16日

Unified Hypersphere Embedding for Speaker Recognition

Arxiv

5+阅读 · 2018年7月22日

VIP会员

文章信息

相关主题

置换不变性

相关VIP内容

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

【ACMMM2020】小规模行人检测的自模拟学习

【ACMMM2020】小规模行人检测的自模拟学习

专知会员服务

15+阅读 · 2020年9月25日

斯坦福2020硬课《分布式算法与优化》

斯坦福2020硬课《分布式算法与优化》

专知会员服务

123+阅读 · 2020年5月6日

【Google Research】Wavesplit:通过说话者聚类实现端到端的语音分离，Wavesplit: End-to-End Speech Separation by Speaker Clustering

【Google Research】Wavesplit:通过说话者聚类实现端到端的语音分离，Wavesplit: End-to-End Speech Separation by Speaker Clustering

专知会员服务

19+阅读 · 2020年2月26日

【北邮-腾讯AI】自监督学习音视觉说话人认证，Self-supervised learning for audio-visual speaker diarization

【北邮-腾讯AI】自监督学习音视觉说话人认证，Self-supervised learning for audio-visual speaker diarization

专知会员服务

26+阅读 · 2020年2月16日

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

【Yoshua Bengio新论文】多任务自监督学习语音识别，MULTI-TASK SELF-SUPERVISED LEARNING FOR ROBUST SPEECH RECOGNITION

专知会员服务

39+阅读 · 2020年1月30日

【目标检测 | 2019最新综述】目标检测的最新进展，附40页PDF，Recent Advances in Deep Learning for Object Detection

【目标检测 | 2019最新综述】目标检测的最新进展，附40页PDF，Recent Advances in Deep Learning for Object Detection

专知会员服务

85+阅读 · 2019年11月15日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

126页ppt《AI应用（AI Agent）开发新范式》！

基于深度神经网络的视频分析中的效率优化技术综述：处理系统、算法与应用

WWW2025 | KAG：一种大模型知识增强生成框架

用于时间序列预测的扩散模型：综述

相关资讯

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

模式国重实验室21篇论文入选CVPR 2020

模式国重实验室21篇论文入选CVPR 2020

专知

30+阅读 · 2020年3月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

详解GAN的谱归一化（Spectral Normalization）

详解GAN的谱归一化（Spectral Normalization）

PaperWeekly

11+阅读 · 2019年2月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

【今日新增】IEEE Trans.专刊截稿信息8条

【今日新增】IEEE Trans.专刊截稿信息8条

Call4Papers

7+阅读 · 2017年6月29日

相关论文

EMA2S: An End-to-End Multimodal Articulatory-to-Speech System

Arxiv

0+阅读 · 2021年6月9日

Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training

Arxiv

0+阅读 · 2021年6月8日

Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation

Arxiv

0+阅读 · 2021年6月8日

Speedy Performance Estimation for Neural Architecture Search

Arxiv

0+阅读 · 2021年6月8日

Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

Arxiv

0+阅读 · 2021年6月6日

Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication

Arxiv

0+阅读 · 2021年6月5日

Orthogonal Over-Parameterized Training

Arxiv

0+阅读 · 2021年6月5日

End-to-End Multi-speaker Speech Recognition with Transformer

Arxiv

8+阅读 · 2020年2月13日

Sample Efficient Adaptive Text-to-Speech

Arxiv

5+阅读 · 2019年1月16日

Unified Hypersphere Embedding for Speaker Recognition

Arxiv

5+阅读 · 2018年7月22日

微信扫码咨询专知VIP会员