Learning to localize and separate individual object sounds in the audio channel of a video is a difficult task. Current state-of-the-art methods predict audio masks from artificially mixed spectrograms, following the Mix-and-Separate framework. We propose audio-visual co-segmentation, in which the network learns both what individual objects look like and what they sound like, from videos labeled only with object labels. Unlike other recent visually guided audio source separation frameworks, our architecture can be learned end-to-end and requires no additional supervision or bounding-box proposals. Specifically, we introduce weakly-supervised object segmentation in the context of sound separation. We also formulate spectrogram mask prediction using a set of learned mask bases, which are combined using coefficients conditioned on the output of the object segmentation, a design that facilitates separation. Extensive experiments on the MUSIC dataset show that our proposed approach outperforms state-of-the-art methods on visually guided sound source separation and sound denoising.
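To make the Mix-and-Separate idea concrete, the sketch below (an illustrative assumption, not taken from the paper) shows how such training pairs are typically built: two single-source magnitude spectrograms are summed into an artificial mixture, and their ideal ratio masks serve as supervision for the mask predictor.

```python
# Minimal sketch of Mix-and-Separate style target construction (assumed).
import torch

def mix_and_separate_targets(spec_a, spec_b, eps=1e-8):
    """spec_a, spec_b: (F, T) magnitude spectrograms of two solo recordings."""
    mixture = spec_a + spec_b
    # Ideal ratio masks: each source's share of energy at every time-frequency bin.
    mask_a = spec_a / (mixture + eps)
    mask_b = spec_b / (mixture + eps)
    return mixture, mask_a, mask_b

# Example with random tensors standing in for two solo instrument clips.
spec_a = torch.rand(256, 256)
spec_b = torch.rand(256, 256)
mixture, mask_a, mask_b = mix_and_separate_targets(spec_a, spec_b)
# A separation network takes `mixture` (plus visual features) and is trained
# to predict `mask_a` / `mask_b`; applying a predicted mask to `mixture`
# recovers an estimate of the corresponding source.
```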
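The following sketch illustrates, under assumed tensor shapes and layer sizes (not the authors' exact architecture), how a spectrogram mask can be predicted as a weighted combination of learned mask bases, with coefficients conditioned on pooled object-segmentation features.

```python
# Hypothetical mask-basis head: combine K learned mask bases using
# coefficients predicted from object-segmentation features (a sketch).
import torch
import torch.nn as nn

class MaskBasisHead(nn.Module):
    def __init__(self, num_bases=16, seg_feat_dim=512, freq_bins=256, time_frames=256):
        super().__init__()
        # K learnable mask bases, each a full-resolution spectrogram mask.
        self.bases = nn.Parameter(torch.randn(num_bases, freq_bins, time_frames) * 0.01)
        # Map pooled segmentation features to K combination coefficients.
        self.coeff_net = nn.Linear(seg_feat_dim, num_bases)

    def forward(self, seg_features):
        # seg_features: (B, seg_feat_dim), e.g. globally pooled segmentation features.
        coeffs = torch.softmax(self.coeff_net(seg_features), dim=-1)   # (B, K)
        # Weighted sum of bases, squashed to [0, 1] to form a ratio mask.
        mask = torch.einsum('bk,kft->bft', coeffs, self.bases)         # (B, F, T)
        return torch.sigmoid(mask)

# Usage: multiply the predicted mask with the mixture spectrogram to
# estimate the spectrogram of the selected object's sound.
head = MaskBasisHead()
seg_features = torch.randn(4, 512)        # pooled features for 4 video clips
mixture_spec = torch.rand(4, 256, 256)    # magnitude spectrograms of the mixtures
separated_spec = head(seg_features) * mixture_spec
```

Conditioning only the low-dimensional coefficients, rather than the full mask, on the visual stream keeps the mask space shared across objects, which is one plausible reading of why this design facilitates separation.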