从生物医学文献中提取知识图创建关系图 (Relationship extraction for knowledge graph creation from biomedical literature) - 专知论文

会员服务 ·

0

entity · 关系抽取 · T5 · 图 · 知识图谱 ·

2022 年 1 月 5 日

Relationship extraction for knowledge graph creation from biomedical literature

翻译：从生物医学文献中提取知识图创建关系图

Nikola Milosevic,Wolfgang Thielemann

from arxiv, Paper submitted to Journal of Semantic Web

Biomedical research is growing in such an exponential pace that scientists, researchers and practitioners are no more able to cope with the amount of published literature in the domain. The knowledge presented in the literature needs to be systematized in such a ways that claims and hypothesis can be easily found, accessed and validated. Knowledge graphs can provide such framework for semantic knowledge representation from literature. However, in order to build knowledge graph, it is necessary to extract knowledge in form of relationships between biomedical entities and normalize both entities and relationship types. In this paper, we present and compare few rule-based and machine learning-based (Naive Bayes, Random Forests as examples of traditional machine learning methods and T5-based model as an example of modern deep learning) methods for scalable relationship extraction from biomedical literature for the integration into the knowledge graphs. We examine how resilient are these various methods to unbalanced and fairly small datasets, showing that T5 model handles well both small datasets, due to its pre-training on large C4 dataset as well as unbalanced data. The best performing model was T5 model fine-tuned on balanced data, with reported F1-score of 0.88.

翻译：科学、研究人员和从业者无法更有能力应付这一领域已出版的大量文献,因此,生物医学研究正在以如此迅速的速度发展,科学家、研究人员和从业者无法应付这一领域的大量文献。文献中提供的知识需要以易于找到、获取和验证的方式系统化。知识图表可以为文献中的语义知识表述提供这种框架。然而,为了建立知识图,有必要以生物医学实体之间的关系和实体及关系类型的正常化的形式提取知识。本文提出并比较了少数基于规则和机器学习的(自然海湾、随机森林,作为传统机器学习方法的范例和基于T5的模型,作为现代深层学习的范例)方法,从生物医学文献中提取可扩展的关系,以便纳入知识图中。我们研究了这些各种方法对不平衡和相当小的数据集的适应性,表明T5模型处理器和小数据集都是由于对大型C4数据集的预先培训以及不平衡的数据。最佳的模型是T5模型,对平衡数据作了调整,有报告的F1核心为0.88。

0

相关内容

entity

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【斯坦福】从电子病历EHR构建知识图谱，Robustly Extracting Medical Knowledge from EHRs:A Case Study of Learning a Health Knowledge Graph

【斯坦福】从电子病历EHR构建知识图谱，Robustly Extracting Medical Knowledge from EHRs:A Case Study of Learning a Health Knowledge Graph

专知会员服务

56+阅读 · 2020年6月2日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

随机动力系统中熵结构与热力学形式的研究

国家自然科学基金

0+阅读 · 2014年12月31日

miR34a经Notch1通路调节COXⅡ在COPD中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

代谢型谷氨酸受体5对大鼠脑缺血损伤后神经突起生长的作用和机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

ERG介导组蛋白修饰调控CRMP4失活启动前列腺癌转移的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

关于AI-半环簇与 Conway半环簇的研究

国家自然科学基金

1+阅读 · 2012年12月31日

糖尿病血管钙化的新机制：高糖诱导内皮细胞－成骨细胞转分化的研究

国家自然科学基金

0+阅读 · 2012年12月31日

《软件学报》学术期刊

国家自然科学基金

6+阅读 · 2011年12月31日

PI-IBS中TMEM16A介导IL-4对Cajal细胞损伤的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

图的几类谱及其与图的变换的关系

国家自然科学基金

0+阅读 · 2011年12月31日

含控制器的电力系统递阶（结构化）模型研究

国家自然科学基金

0+阅读 · 2010年12月31日

LitMC-BERT: transformer-based multi-label classification of biomedical literature with an application on COVID-19 literature curation

Arxiv

0+阅读 · 2022年4月19日

Multi-Modal Knowledge Graph Construction and Application: A Survey

Arxiv

79+阅读 · 2022年2月11日

Data-Free Knowledge Transfer: A Survey

Arxiv

21+阅读 · 2021年12月31日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

A Survey of Deep Learning for Scientific Discovery

A Survey of Deep Learning for Scientific Discovery

Arxiv

29+阅读 · 2020年3月26日

Knowledge Graphs

Arxiv

102+阅读 · 2020年3月4日

Entity Context and Relational Paths for Knowledge Graph Completion

Arxiv

29+阅读 · 2020年2月17日

A Survey on Causal Inference

Arxiv

112+阅读 · 2020年2月5日

Few-Shot Knowledge Graph Completion

Arxiv

14+阅读 · 2019年11月26日

EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

Arxiv

21+阅读 · 2018年1月16日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【斯坦福】从电子病历EHR构建知识图谱，Robustly Extracting Medical Knowledge from EHRs:A Case Study of Learning a Health Knowledge Graph

【斯坦福】从电子病历EHR构建知识图谱，Robustly Extracting Medical Knowledge from EHRs:A Case Study of Learning a Health Knowledge Graph

专知会员服务

56+阅读 · 2020年6月2日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

面向具身智能的多模态数据存储与检索：综述

《算法战争研究计划全景评估》35页

【CMU博士论文】水下三维视觉感知与生成

智能体战争：自主人工智能军备竞赛全景透视

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

LitMC-BERT: transformer-based multi-label classification of biomedical literature with an application on COVID-19 literature curation

Arxiv

0+阅读 · 2022年4月19日

Multi-Modal Knowledge Graph Construction and Application: A Survey

Arxiv

79+阅读 · 2022年2月11日

Data-Free Knowledge Transfer: A Survey

Arxiv

21+阅读 · 2021年12月31日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

A Survey of Deep Learning for Scientific Discovery

A Survey of Deep Learning for Scientific Discovery

Arxiv

29+阅读 · 2020年3月26日

Knowledge Graphs

Arxiv

102+阅读 · 2020年3月4日

Entity Context and Relational Paths for Knowledge Graph Completion

Arxiv

29+阅读 · 2020年2月17日

A Survey on Causal Inference

Arxiv

112+阅读 · 2020年2月5日

Few-Shot Knowledge Graph Completion

Arxiv

14+阅读 · 2019年11月26日

EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

Arxiv

21+阅读 · 2018年1月16日

相关基金

随机动力系统中熵结构与热力学形式的研究

国家自然科学基金

0+阅读 · 2014年12月31日

miR34a经Notch1通路调节COXⅡ在COPD中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

代谢型谷氨酸受体5对大鼠脑缺血损伤后神经突起生长的作用和机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

ERG介导组蛋白修饰调控CRMP4失活启动前列腺癌转移的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

关于AI-半环簇与 Conway半环簇的研究

国家自然科学基金

1+阅读 · 2012年12月31日

糖尿病血管钙化的新机制：高糖诱导内皮细胞－成骨细胞转分化的研究

国家自然科学基金

0+阅读 · 2012年12月31日

《软件学报》学术期刊

国家自然科学基金

6+阅读 · 2011年12月31日

PI-IBS中TMEM16A介导IL-4对Cajal细胞损伤的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

图的几类谱及其与图的变换的关系

国家自然科学基金

0+阅读 · 2011年12月31日

含控制器的电力系统递阶（结构化）模型研究

国家自然科学基金

0+阅读 · 2010年12月31日

微信扫码咨询专知VIP会员