UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training

knowledge · datasets · multimodal · graphs · knowledge graphs

March 21, 2023


Biao Gong, Xiaoying Xie, Yutong Feng, Yiliang Lv, Yujun Shen, Deli Zhao

This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data. Focusing on the visual and linguistic modalities, we categorize data knowledge into five unit types, namely in-image, in-text, cross-image, cross-text, and image-text, and set up an efficient pipeline that helps construct a multimodal knowledge graph from any data collection. Thanks to the logical information naturally contained in a knowledge graph, organizing datasets under the UKnow format opens up more possibilities for data usage than the commonly used image-text pairs. Following the UKnow protocol, we collect, from public international news, a large-scale multimodal knowledge graph dataset that consists of 1,388,568 nodes (571,791 of them vision-related) and 3,673,817 triplets. The dataset is also annotated with rich event tags, including 11 coarse labels and 9,185 fine labels. Experiments on four benchmarks demonstrate the potential of UKnow in supporting common-sense reasoning and boosting vision-language pre-training with a single dataset, benefiting from its unified form of knowledge organization. Code, dataset, and models will be made publicly available.
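To make the five unit types and the triplet organization concrete, the following is a minimal Python sketch of how such a graph might be held in memory. It is an illustration under stated assumptions, not the authors' data format: the class names `UnitType`, `Node`, and `Triplet`, the `modality` field, the relation strings, and the helper functions are all hypothetical and do not come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class UnitType(Enum):
    """The five knowledge unit types named in the abstract."""
    IN_IMAGE = "in-image"
    IN_TEXT = "in-text"
    CROSS_IMAGE = "cross-image"
    CROSS_TEXT = "cross-text"
    IMAGE_TEXT = "image-text"


@dataclass(frozen=True)
class Node:
    node_id: str
    modality: str  # hypothetical field: "image" or "text"


@dataclass(frozen=True)
class Triplet:
    head: Node
    relation: str  # hypothetical relation label
    tail: Node
    unit_type: UnitType


def count_by_unit_type(triplets):
    """Count how many triplets fall under each unit type."""
    return Counter(t.unit_type for t in triplets)


def image_text_pairs(triplets):
    """Recover plain (image, text) pairs from the image-text triplets only."""
    return [
        (t.head.node_id, t.tail.node_id)
        for t in triplets
        if t.unit_type is UnitType.IMAGE_TEXT
    ]


# Toy data, invented for illustration only.
img_a = Node("img_a", "image")
img_b = Node("img_b", "image")
cap_a = Node("cap_a", "text")
title = Node("title", "text")

triplets = [
    Triplet(img_a, "described_by", cap_a, UnitType.IMAGE_TEXT),
    Triplet(img_a, "similar_to", img_b, UnitType.CROSS_IMAGE),
    Triplet(cap_a, "same_event_as", title, UnitType.CROSS_TEXT),
]

print(count_by_unit_type(triplets))
print(image_text_pairs(triplets))
```

In this sketch, the usual image-text pairs are just one filtered view of the graph, while the cross-image and cross-text triplets remain available for other uses such as reasoning over linked nodes.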

