Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective

June 7, 2023

Yingying Fan, Yu Wu, Yutian Lin, Bo Du

We focus on the weakly-supervised audio-visual video parsing (AVVP) task, which aims to identify and localize all events in the audio and visual modalities. Previous work concentrates only on denoising video-level labels across modalities and overlooks segment-level label noise, where adjacent video segments (i.e., 1-second video clips) may contain different events. Recognizing the events in a segment is challenging, however, because its label could be any combination of the events that occur in the video. To address this issue, we tackle AVVP from the language perspective, since language can freely describe how various events appear in each segment, beyond fixed labels. Specifically, we design language prompts that describe all cases of event appearance for each video. We then compute the similarity between the language prompts and each segment, and take the events of the most similar prompt as the segment-level label. In addition, to deal with mislabeled segments, we perform dynamic re-weighting on the unreliable segments to adjust their labels. Experiments show that our simple yet effective approach outperforms state-of-the-art methods by a large margin.
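The prompt-matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt template, the choice to enumerate only non-empty event combinations, the function names, and the toy 2-D embeddings are all assumptions made for illustration. In practice the segment and prompt embeddings would come from pretrained audio/visual and text encoders, which the abstract does not specify, and the dynamic re-weighting step is not shown.

```python
from itertools import combinations
import math


def event_combinations(video_events):
    """Enumerate every non-empty combination of the video-level events.

    Assumption: one prompt per non-empty subset; the abstract only says
    the prompts cover "all cases of event appearance".
    """
    events = sorted(video_events)
    return [combo
            for r in range(1, len(events) + 1)
            for combo in combinations(events, r)]


def build_prompt(combo):
    # Hypothetical template; the actual prompt wording is not given in the abstract.
    return "A video segment of " + " and ".join(combo)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def label_segments(segment_embs, prompt_embs, combos):
    """Give each segment the event set of its most similar prompt."""
    labels = []
    for seg in segment_embs:
        sims = [cosine(seg, p) for p in prompt_embs]
        best = max(range(len(sims)), key=sims.__getitem__)
        labels.append(set(combos[best]))
    return labels


# Toy example: a video weakly labeled with two events.
combos = event_combinations({"speech", "dog"})
prompts = [build_prompt(c) for c in combos]

# Made-up 2-D embeddings standing in for encoder outputs (axis 0 ~ dog, axis 1 ~ speech).
prompt_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
segment_embs = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.8]]

labels = label_segments(segment_embs, prompt_embs, combos)
```

Because the candidate prompts are built only from the events in the video-level label, a segment can never be assigned an event that is absent from the video; the matching only decides which of those events are present in that particular segment.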
