Deep neural network based methods have been successfully applied to music source separation (MSS). They typically learn a mapping from a mixture spectrogram to a set of source spectrograms, using magnitudes only. This approach has several limitations: 1) the incorrect phase reconstruction degrades performance; 2) it limits the magnitude of masks to between 0 and 1, whereas we observe that 22% of time-frequency bins have ideal ratio mask values over 1 in a popular dataset, MUSDB18; 3) its potential on very deep architectures is under-explored. Our proposed system is designed to overcome these limitations. First, we propose to estimate phases by estimating complex ideal ratio masks (cIRMs), where we decouple the estimation of cIRMs into magnitude and phase estimations. Second, we extend the separation method to effectively allow the magnitude of the mask to be larger than 1. Finally, we propose a residual UNet architecture with up to 143 layers. Our proposed system achieves a state-of-the-art MSS result on the MUSDB18 dataset; in particular, an SDR of 8.98 dB on vocals, outperforming the previous best performance of 7.24 dB. The source code is available at: https://github.com/bytedance/music_source_separation
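The decoupling described above can be illustrated with a minimal sketch: a magnitude mask that is non-negative but not capped at 1 is combined with a unit-modulus phase estimate into a complex mask applied to the mixture STFT. The arrays standing in for network outputs here are random placeholders, not the paper's actual model predictions.

```python
import numpy as np

# Hypothetical sketch of a decoupled complex ideal ratio mask (cIRM):
# estimate a magnitude mask and a phase separately, then combine them.
rng = np.random.default_rng(0)
freq_bins, frames = 4, 5

# Mixture STFT (complex-valued), a stand-in for a real spectrogram.
mixture = (rng.standard_normal((freq_bins, frames))
           + 1j * rng.standard_normal((freq_bins, frames)))

# Placeholder "network outputs" for magnitude and phase branches.
mag_logits = rng.standard_normal((freq_bins, frames))
phase_real = rng.standard_normal((freq_bins, frames))
phase_imag = rng.standard_normal((freq_bins, frames))

# Magnitude mask via ReLU: non-negative and, crucially, not bounded by 1,
# so bins where the ideal ratio mask exceeds 1 can still be represented.
mag_mask = np.maximum(mag_logits, 0.0)

# Phase estimate as a unit-modulus complex number: predict a real/imaginary
# pair and normalize it, avoiding direct regression of a wrapped angle.
norm = np.sqrt(phase_real**2 + phase_imag**2) + 1e-8
phase = (phase_real + 1j * phase_imag) / norm

# Combine into a complex mask and apply it to the mixture STFT.
cirm = mag_mask * phase
separated = cirm * mixture
```

The key property is that `np.abs(cirm)` equals `mag_mask` exactly, so the magnitude and phase estimates remain independent, and nothing in the construction clips the mask magnitude at 1.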