GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval - Zhuanzhi (专知) Papers


August 10, 2022


Yuxuan Wang, Difei Gao, Licheng Yu, Stan Weixian Lei, Matt Feiszli, Mike Zheng Shou

From arXiv; in Proceedings of the European Conference on Computer Vision 2022 (ECCV 2022)

Cognitive science has shown that humans perceive videos in terms of events separated by the state changes of dominant subjects. State changes trigger new events and are one of the most useful among the large amount of redundant information perceived. However, previous research focuses on the overall understanding of segments without evaluating the fine-grained status changes inside. In this paper, we introduce a new dataset called Kinetic-GEB+. The dataset consists of over 170k boundaries associated with captions describing status changes in the generic events in 12K videos. Upon this new dataset, we propose three tasks supporting the development of a more fine-grained, robust, and human-like understanding of videos through status changes. We evaluate many representative baselines in our dataset, where we also design a new TPD (Temporal-based Pairwise Difference) Modeling method for visual difference and achieve significant performance improvements. Besides, the results show there are still formidable challenges for current methods in the utilization of different granularities, representation of visual difference, and the accurate localization of status changes. Further analysis shows that our dataset can drive developing more powerful methods to understand status changes and thus improve video level comprehension. The dataset is available at https://github.com/showlab/GEB-Plus
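The abstract names a TPD (Temporal-based Pairwise Difference) modeling method but does not describe how it works. The sketch below is therefore not the authors' method. It is a toy illustration, under our own assumptions, of what a pairwise difference between frames on either side of an event boundary could look like. The function names `pairwise_differences` and `mean_pool`, the 2-dimensional feature vectors, and the mean-pooling step are all hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]


def pairwise_differences(before: List[Vector], after: List[Vector]) -> List[List[Vector]]:
    """Return diffs[i][j] = after[j] - before[i] for every (before, after) frame pair.

    Hypothetical helper: each vector stands in for a per-frame visual feature.
    """
    return [
        [[a - b for a, b in zip(after_vec, before_vec)] for after_vec in after]
        for before_vec in before
    ]


def mean_pool(diffs: List[List[Vector]]) -> Vector:
    """Average all pairwise difference vectors into one summary vector.

    Assumption: mean pooling is used here only to keep the example small.
    """
    flat = [vec for row in diffs for vec in row]
    dim = len(flat[0])
    return [sum(vec[k] for vec in flat) / len(flat) for k in range(dim)]


# Toy per-frame features sampled before and after a boundary (made-up numbers).
before = [[0.0, 1.0], [0.0, 3.0]]
after = [[2.0, 1.0], [4.0, 1.0]]

diffs = pairwise_differences(before, after)
summary = mean_pool(diffs)
print(summary)
```

Comparing every frame before the boundary with every frame after it, rather than only the two adjacent frames, is one plausible way to make a difference representation less sensitive to a single noisy frame. Whether the paper's method does this is not stated in the abstract.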

