September 2, 2022

Audio-Visual Segmentation


Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

From arXiv. ECCV 2022. Equation (3) is corrected and the notation of the evaluation metrics is updated in the latest arXiv version. Code is available at https://github.com/OpenNLPLab/AVSBench

We propose to explore a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound at the time of the image frame. To facilitate this research, we construct the first audio-visual segmentation benchmark (AVSBench), providing pixel-wise annotations for the sounding objects in audible videos. Two settings are studied with this benchmark: 1) semi-supervised audio-visual segmentation with a single sound source and 2) fully-supervised audio-visual segmentation with multiple sound sources. To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process. We also design a regularization loss to encourage the audio-visual mapping during training. Quantitative and qualitative experiments on the AVSBench compare our approach to several existing methods from related tasks, demonstrating that the proposed method is promising for building a bridge between the audio and pixel-wise visual semantics. Code is available at https://github.com/OpenNLPLab/AVSBench.
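To make the idea of injecting audio semantics into pixel-wise visual features more concrete, here is a minimal sketch in plain Python. It is not the authors' module or the code in the AVSBench repository: the function names (`fuse_audio_into_pixels`, `sounding_masks`), the residual cross-attention over audio time steps, and the cosine-similarity scoring with a 0.5 threshold are all assumptions chosen for illustration, and the features are toy values rather than learned embeddings.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_audio_into_pixels(visual, audio):
    """Hypothetical residual cross-attention: each pixel attends to all audio steps.

    visual: T frames, each a list of N pixel feature vectors of length C
    audio:  T audio feature vectors of length C (one per frame)
    """
    scale = math.sqrt(len(audio[0]))
    fused = []
    for frame in visual:
        fused_frame = []
        for pixel in frame:
            # Attention weights of this pixel over the T audio time steps.
            weights = softmax([dot(pixel, a) / scale for a in audio])
            # Audio context vector: weighted sum of the audio features.
            context = [sum(w * a[c] for w, a in zip(weights, audio))
                       for c in range(len(pixel))]
            # Residual connection keeps the original visual evidence.
            fused_frame.append([p + g for p, g in zip(pixel, context)])
        fused.append(fused_frame)
    return fused


def sounding_masks(fused, audio, threshold=0.5):
    """Assumed stand-in for a decoder: cosine(pixel, audio_t) > threshold."""
    masks = []
    for frame, a in zip(fused, audio):
        a_norm = math.sqrt(dot(a, a)) or 1.0
        mask = []
        for p in frame:
            p_norm = math.sqrt(dot(p, p)) or 1.0
            mask.append(1 if dot(p, a) / (p_norm * a_norm) > threshold else 0)
        masks.append(mask)
    return masks


# Toy input: 2 frames, 3 pixels per frame, 2 feature channels.
visual = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    [[0.0, 1.0], [1.0, 0.0], [0.0, -1.0]],
]
audio = [[1.0, 0.0], [0.0, 1.0]]

fused = fuse_audio_into_pixels(visual, audio)
masks = sounding_masks(fused, audio)
print(masks)
```

In this toy setup only the pixel whose feature aligns with the audio feature of its own frame ends up in the mask. A real system would replace the cosine threshold with a learned segmentation decoder and would obtain both feature sets from trained encoders.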

