与自然语言的本地化多级2级时间相近网络 (Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language) - 专知论文

会员服务 ·

0

矩 · Networking · Single-Shot · 判别器 · 缩放 ·

2020 年 12 月 4 日

Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language

翻译：与自然语言的本地化多级2级时间相近网络

Songyang Zhang,Houwen Peng,Jianlong Fu,Yijuan Lu,Jiebo Luo

We address the problem of retrieving a specific moment from an untrimmed video by natural language. It is a challenging problem because a target moment may take place in the context of other temporal moments in the untrimmed video. Existing methods cannot tackle this challenge well since they do not fully consider the temporal contexts between temporal moments. In this paper, we model the temporal context between video moments by a set of predefined two-dimensional maps under different temporal scales. For each map, one dimension indicates the starting time of a moment and the other indicates the duration. These 2D temporal maps can cover diverse video moments with different lengths, while representing their adjacent contexts at different temporal scales. Based on the 2D temporal maps, we propose a Multi-Scale Temporal Adjacent Network (MS-2D-TAN), a single-shot framework for moment localization. It is capable of encoding the adjacent temporal contexts at each scale, while learning discriminative features for matching video moments with referring expressions. We evaluate the proposed MS-2D-TAN on three challenging benchmarks, i.e., Charades-STA, ActivityNet Captions, and TACoS, where our MS-2D-TAN outperforms the state of the art.

翻译：我们用自然语言从未剪辑的视频中找到一个特定时刻的问题。这是一个具有挑战性的问题, 因为目标时刻可能发生在未剪辑的视频中其他时间时刻的背景下。现有方法无法很好地应对这一挑战, 因为它们没有充分考虑时间间隔之间的时间背景。在本文中, 我们用不同时间尺度下一套预先定义的二维地图来模拟视频时间间隔之间的时间背景。对于每个地图, 一个维度表示一个时刻的起始时间, 另一个维度表示时间长度。这些 2D 时间地图可以覆盖不同长度的不同视频时刻, 同时在不同的时间尺度上代表其相邻环境。基于 2D 时间尺度的地图, 我们建议建立一个多层次的双层温度相邻网络( MS-2D- TAN), 这是用于时空定位的单张框架。它能够将每个尺度的相邻时间环境进行校正, 同时学习将视频时刻与表示相匹配的区别性特征。我们根据三个具有挑战性的基准, 即 Charades- STA、 actNet Captions, 以及 TACS, 在那里, 我们的 MS-2DAN 。

6

相关内容

【UIUC】最新《自监督学习》教程，51页ppt，Self-supervised learning

【UIUC】最新《自监督学习》教程，51页ppt，Self-supervised learning

专知会员服务

84+阅读 · 2020年11月25日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【报告推荐】三维及超几何处理中的几何与数据学习（Geometry and Learning from Data in 3D and Beyond - Geometric Processing ）

【报告推荐】三维及超几何处理中的几何与数据学习（Geometry and Learning from Data in 3D and Beyond - Geometric Processing ）

专知会员服务

12+阅读 · 2019年11月10日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【深度学习视频分析/多模态学习资源大列表】

【深度学习视频分析/多模态学习资源大列表】

专知会员服务

92+阅读 · 2019年10月16日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

计算机视觉领域顶会CVPR 2018 接受论文列表

计算机视觉领域顶会CVPR 2018 接受论文列表

专知

7+阅读 · 2018年5月26日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【紫冬报告】吴毅红研究员：2017以来的2D到3D

【紫冬报告】吴毅红研究员：2017以来的2D到3D

中国科学院自动化研究所

11+阅读 · 2018年5月8日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Semi-Supervised Action Recognition with Temporal Contrastive Learning

Arxiv

1+阅读 · 2021年2月4日

Multi-Scale Self-Attention for Text Classification

Arxiv

4+阅读 · 2019年12月2日

Less is More: Learning Highlight Detection from Video Duration

Less is More: Learning Highlight Detection from Video Duration

Arxiv

7+阅读 · 2019年3月3日

Deep Ordinal Hashing with Spatial Attention

Arxiv

9+阅读 · 2018年5月7日

To Create What You Tell: Generating Videos from Captions

Arxiv

3+阅读 · 2018年4月23日

Outline Objects using Deep Reinforcement Learning

Arxiv

9+阅读 · 2018年4月20日

Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning

Arxiv

5+阅读 · 2018年4月3日

Quadruplet Network with One-Shot Learning for Fast Visual Object Tracking

Arxiv

10+阅读 · 2018年3月17日

Mono-Camera 3D Multi-Object Tracking Using Deep Learning Detections and PMBM Filtering

Arxiv

10+阅读 · 2018年2月27日

Video Person Re-identification by Temporal Residual Learning

Arxiv

5+阅读 · 2018年2月22日

VIP会员

文章信息

相关主题

相关VIP内容

【UIUC】最新《自监督学习》教程，51页ppt，Self-supervised learning

【UIUC】最新《自监督学习》教程，51页ppt，Self-supervised learning

专知会员服务

84+阅读 · 2020年11月25日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【报告推荐】三维及超几何处理中的几何与数据学习（Geometry and Learning from Data in 3D and Beyond - Geometric Processing ）

【报告推荐】三维及超几何处理中的几何与数据学习（Geometry and Learning from Data in 3D and Beyond - Geometric Processing ）

专知会员服务

12+阅读 · 2019年11月10日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【深度学习视频分析/多模态学习资源大列表】

【深度学习视频分析/多模态学习资源大列表】

专知会员服务

92+阅读 · 2019年10月16日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

数据驱动死亡：以色列AI战争机器如何锁定目标

【普林斯顿博士论文】通过以人为本的评估推动负责任的人工智能

ICML 2025 | BiAssemble: 双臂机器人几何拼合问题的协同可供性学习

ICML 2025杰出论文出炉：8篇获奖，南大研究者榜上有名

相关资讯

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

计算机视觉领域顶会CVPR 2018 接受论文列表

计算机视觉领域顶会CVPR 2018 接受论文列表

专知

7+阅读 · 2018年5月26日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【紫冬报告】吴毅红研究员：2017以来的2D到3D

【紫冬报告】吴毅红研究员：2017以来的2D到3D

中国科学院自动化研究所

11+阅读 · 2018年5月8日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

相关论文

Semi-Supervised Action Recognition with Temporal Contrastive Learning

Arxiv

1+阅读 · 2021年2月4日

Multi-Scale Self-Attention for Text Classification

Arxiv

4+阅读 · 2019年12月2日

Less is More: Learning Highlight Detection from Video Duration

Less is More: Learning Highlight Detection from Video Duration

Arxiv

7+阅读 · 2019年3月3日

Deep Ordinal Hashing with Spatial Attention

Arxiv

9+阅读 · 2018年5月7日

To Create What You Tell: Generating Videos from Captions

Arxiv

3+阅读 · 2018年4月23日

Outline Objects using Deep Reinforcement Learning

Arxiv

9+阅读 · 2018年4月20日

Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning

Arxiv

5+阅读 · 2018年4月3日

Quadruplet Network with One-Shot Learning for Fast Visual Object Tracking

Arxiv

10+阅读 · 2018年3月17日

Mono-Camera 3D Multi-Object Tracking Using Deep Learning Detections and PMBM Filtering

Arxiv

10+阅读 · 2018年2月27日

Video Person Re-identification by Temporal Residual Learning

Arxiv

5+阅读 · 2018年2月22日

微信扫码咨询专知VIP会员