Vision-language (V+L) pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text. While existing vision-language pretraining models primarily focus on understanding objects in images or entities in text, they often ignore alignment at the level of events and their argument structures. In this work, we propose a contrastive learning framework to enforce vision-language pretraining models to comprehend events and their associated argument (participant) roles. To achieve this, we take advantage of text information extraction technologies to obtain event structural knowledge, and utilize multiple prompt functions to contrast difficult negative descriptions produced by manipulating event structures. We also design an event graph alignment loss based on optimal transport to capture event argument structures. In addition, we collect a large event-rich dataset (106,875 images) for pretraining, which provides a more challenging image retrieval benchmark for assessing the understanding of complicated, lengthy sentences. Experiments show that our zero-shot CLIP-Event outperforms the state-of-the-art supervised model in argument extraction on Multimedia Event Extraction, achieving more than 5\% absolute F-score gain in event extraction, as well as significant improvements on a variety of downstream tasks under zero-shot settings.
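The contrastive idea above can be illustrated with a minimal sketch. The snippet below is a hypothetical, simplified version (not the paper's implementation): a prompt function verbalizes an event structure, a hard negative is produced by swapping argument roles, and an InfoNCE-style loss pushes the image embedding toward the correct description. The function names, the prompt template, and the toy embeddings are all illustrative assumptions.

```python
import numpy as np

def prompt(event_type, agent, patient):
    # Hypothetical prompt function: verbalize an event structure as text.
    return f"An image of a {event_type} event, where {agent} acts on {patient}."

def contrastive_event_loss(image_emb, pos_text_emb, neg_text_embs, temperature=0.07):
    """InfoNCE-style loss: the image should be more similar to the correct
    event description than to manipulated (hard-negative) descriptions."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    # Similarity of the image to the positive (index 0) and each negative.
    sims = np.array([cos(image_emb, pos_text_emb)]
                    + [cos(image_emb, n) for n in neg_text_embs]) / temperature
    # Softmax cross-entropy with the positive at index 0.
    sims -= sims.max()
    probs = np.exp(sims) / np.exp(sims).sum()
    return -np.log(probs[0])

# A hard negative keeps the event type but swaps the argument roles.
pos_desc = prompt("Attack", "the soldier", "the town")
neg_desc = prompt("Attack", "the town", "the soldier")

# Toy embeddings standing in for CLIP image/text encoders.
image_emb = np.array([1.0, 0.1])
pos_emb = np.array([0.9, 0.2])   # close to the image
neg_emb = np.array([0.1, 1.0])   # far from the image
loss = contrastive_event_loss(image_emb, pos_emb, [neg_emb])
```

Because the negatives share the event type and arguments with the positive and differ only in role assignment, the model cannot minimize this loss by recognizing objects alone; it must attend to the argument structure.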