Temporal video grounding (TVG) aims to localize a target segment in a video according to a given sentence query. Although existing methods have achieved remarkable progress on this task, they rely heavily on abundant video-query paired data, which is expensive and time-consuming to collect in real-world scenarios. In this paper, we explore whether a video grounding model can be learned without any paired annotations. To the best of our knowledge, this is the first work to address TVG in an unsupervised setting. Since no paired supervision is available, we propose a novel Deep Semantic Clustering Network (DSCNet) that leverages the semantic information of the entire query set to compose the possible activities in each video for grounding. Specifically, we first develop a language semantic mining module that extracts implicit semantic features from the whole query set. These language semantic features then serve as guidance for composing activities in the video via a video-based semantic aggregation module. Finally, a foreground attention branch filters out redundant background activities and refines the grounding results. To validate the effectiveness of DSCNet, we conduct experiments on the ActivityNet Captions and Charades-STA datasets. The results demonstrate that DSCNet achieves competitive performance and even outperforms most weakly-supervised approaches.
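To make the described pipeline concrete, the following is a minimal PyTorch-style sketch of the three components named above (language semantic mining, video-based semantic aggregation, and foreground attention). The module name, feature dimensions, the use of learnable semantic prototypes, and the thresholding step are illustrative assumptions for exposition, not the authors' exact implementation.

```python
# Hedged sketch of the DSCNet idea, assuming PyTorch and clip-level video features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepSemanticClusteringSketch(nn.Module):
    def __init__(self, dim=512, num_clusters=16):
        super().__init__()
        # Language semantic mining (simplified): semantic prototypes that stand in
        # for the implicit semantics mined from the whole query set.
        self.prototypes = nn.Parameter(torch.randn(num_clusters, dim))
        # Foreground attention branch: scores each video clip as foreground.
        self.fg_head = nn.Linear(dim, 1)

    def forward(self, video_feats):
        # video_feats: (T, dim) clip-level features of one video.
        # Video-based semantic aggregation: compose the possible activity by
        # attending each clip to the mined language semantics.
        sim = F.softmax(video_feats @ self.prototypes.t(), dim=-1)    # (T, K)
        composed = sim @ self.prototypes                               # (T, dim)
        # Foreground attention filters redundant background activities.
        fg_score = torch.sigmoid(self.fg_head(composed)).squeeze(-1)  # (T,)
        return composed, fg_score

# Usage: score 64 clips of a video and keep high-scoring clips as segment candidates.
model = DeepSemanticClusteringSketch()
feats = torch.randn(64, 512)
composed, fg = model(feats)
candidate_clips = (fg > 0.5).nonzero().squeeze(-1)
```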