用于文本视频检索的地貌-空间多式数据增强技术 (A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval) - 专知论文

会员服务 ·

0

多峰值 · 数据增强 · Performer · 语义相似度 · Attention ·

2022 年 8 月 3 日

A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval

翻译：用于文本视频检索的地貌-空间多式数据增强技术

Alex Falcon,Giuseppe Serra,Oswald Lanz

from arxiv, Accepted for presentation at 30th ACM International Conference on Multimedia (ACM MM)

Every hour, huge amounts of visual contents are posted on social media and user-generated content platforms. To find relevant videos by means of a natural language query, text-video retrieval methods have received increased attention over the past few years. Data augmentation techniques were introduced to increase the performance on unseen test examples by creating new training samples with the application of semantics-preserving techniques, such as color space or geometric transformations on images. Yet, these techniques are usually applied on raw data, leading to more resource-demanding solutions and also requiring the shareability of the raw data, which may not always be true, e.g. copyright issues with clips from movies or TV series. To address this shortcoming, we propose a multimodal data augmentation technique which works in the feature space and creates new videos and captions by mixing semantically similar samples. We experiment our solution on a large scale public dataset, EPIC-Kitchens-100, and achieve considerable improvements over a baseline method, improved state-of-the-art performance, while at the same time performing multiple ablation studies. We release code and pretrained models on Github at https://github.com/aranciokov/FSMMDA_VideoRetrieval.

翻译：每小时都在社交媒体和用户生成的内容平台上张贴大量视觉内容。为了通过自然语言查询找到相关视频,过去几年来,文本视频检索方法受到越来越多的关注。数据增强技术通过使用语义保存技术(如彩色空间或图像上的几何转换)来创建新的培训样本,以提高隐蔽测试实例的性能。然而,这些技术通常应用在原始数据上,导致更多的资源需求解决方案,并要求共享原始数据,而原始数据可能并不总是真实的,例如电影或电视系列片段的版权问题。为了解决这一缺陷,我们提议采用多式数据增强技术,在功能空间工作,通过将语义相似的样本混合来创建新的视频和字幕。我们在大规模公共数据集上实验我们的解决方案,EPIC-Kitchens-100,并在基线方法上实现相当大的改进,改进了艺术状态性能,同时进行多种反动研究。我们在 https://kovrivarub/Restraimacom上发布了关于Github_Vrievrievrievoria_DA.

0

相关内容

多峰值

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

miR-31参与去甲肾上腺素介导慢性内脏痛敏的表观调控机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于压缩采样∑Δ调制的OFDM-UWB基带关键技术研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于云计算的建筑全生命期BIM集成与应用关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

MicroRNA-1在横纹肌肉瘤中的表观遗传调控机制

国家自然科学基金

0+阅读 · 2012年12月31日

PTBP1介导的survivinΔEx3过表达调控胶质母细胞瘤微血管增生的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

ATF4调控脂肪细胞分化及糖脂代谢的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

nanog对牙髓干细胞增殖分化的影响及信号通路调控

国家自然科学基金

0+阅读 · 2011年12月31日

甲状腺癌中MAPK信号通路介导的表观遗传学改变及其调控机制

国家自然科学基金

0+阅读 · 2011年12月31日

Epac在骨髓间充质干细胞向成骨/脂肪细胞分化转向中的作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

UGT基因簇进化及调控研究

国家自然科学基金

0+阅读 · 2009年12月31日

SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation

Arxiv

0+阅读 · 2022年9月30日

Make-A-Video: Text-to-Video Generation without Text-Video Data

Arxiv

0+阅读 · 2022年9月29日

Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text

Arxiv

0+阅读 · 2022年9月28日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

A Survey on Data Augmentation for Text Classification

A Survey on Data Augmentation for Text Classification

Arxiv

16+阅读 · 2021年7月7日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling

Arxiv

10+阅读 · 2021年2月11日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Attention U-Net: Learning Where to Look for the Pancreas

Arxiv

17+阅读 · 2018年5月20日

VIP会员

文章信息

相关主题

语义相似度

相关VIP内容

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

操作系统智能体：基于多模态大模型（MLLM）的通用计算设备智能体综述

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

【博士论文】推进数据高效的深度学习：非参数 Transformer、主动测试与上下文学习

自主人工智能：未来战争是否将是自主化的？

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation

Arxiv

0+阅读 · 2022年9月30日

Make-A-Video: Text-to-Video Generation without Text-Video Data

Arxiv

0+阅读 · 2022年9月29日

Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text

Arxiv

0+阅读 · 2022年9月28日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

A Survey on Data Augmentation for Text Classification

A Survey on Data Augmentation for Text Classification

Arxiv

16+阅读 · 2021年7月7日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling

Arxiv

10+阅读 · 2021年2月11日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Attention U-Net: Learning Where to Look for the Pancreas

Arxiv

17+阅读 · 2018年5月20日

相关基金

miR-31参与去甲肾上腺素介导慢性内脏痛敏的表观调控机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于压缩采样∑Δ调制的OFDM-UWB基带关键技术研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于云计算的建筑全生命期BIM集成与应用关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

MicroRNA-1在横纹肌肉瘤中的表观遗传调控机制

国家自然科学基金

0+阅读 · 2012年12月31日

PTBP1介导的survivinΔEx3过表达调控胶质母细胞瘤微血管增生的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

ATF4调控脂肪细胞分化及糖脂代谢的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

nanog对牙髓干细胞增殖分化的影响及信号通路调控

国家自然科学基金

0+阅读 · 2011年12月31日

甲状腺癌中MAPK信号通路介导的表观遗传学改变及其调控机制

国家自然科学基金

0+阅读 · 2011年12月31日

Epac在骨髓间充质干细胞向成骨/脂肪细胞分化转向中的作用机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

UGT基因簇进化及调控研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员