Scaling Up Knowledge Graph Creation to Large and Heterogeneous Data Sources - Zhuanzhi (专知) Papers


Knowledge · RDF · Partitioning · Optimizer · Greedy layer-wise pretraining

September 16, 2022

Scaling Up Knowledge Graph Creation to Large and Heterogeneous Data Sources


Enrique Iglesias, Samaneh Jozashoori, Maria-Esther Vidal

RDF knowledge graphs (KGs) are powerful data structures for representing factual statements created from heterogeneous data sources. KG creation is laborious and demands data management techniques to be executed efficiently. This paper tackles the problem of automatically generating declaratively specified KG creation processes; it proposes techniques for planning and transforming heterogeneous data into RDF triples following mapping assertions specified in the RDF Mapping Language (RML). Given a set of mapping assertions, the planner provides an optimized execution plan by partitioning the assertions and scheduling their execution. First, the planner determines an optimized number of partitions, considering the number of data sources, the types of mapping assertions, and the associations between different assertions. After producing the list of partitions and the assertions that belong to each one, the planner determines their execution order. A greedy algorithm generates a bushy tree execution plan over the partitions. Bushy tree plans are translated into operating system commands that guide the execution of the partitions of the mapping assertions in the order indicated by the bushy tree. The proposed optimization approach is evaluated over state-of-the-art RML-compliant engines and existing benchmarks of data sources and RML triples maps. Our experimental results suggest that the performance of the studied engines can be considerably improved, particularly in complex settings with numerous triples maps and large data sources. As a result, engines that time out in complex cases are able to produce at least a portion of the KG when applying the planner.
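The two planning steps described in the abstract (partitioning the mapping assertions, then greedily building a bushy tree over the partitions) can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the partitioning criteria or the cost model, so the sketch assumes that assertions linked by a join belong to the same partition and that the greedy step merges the two cheapest subplans first. The names `TriplesMap`, `partition`, `bushy_plan`, and the `cost` function are all hypothetical.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class TriplesMap:
    """Hypothetical stand-in for one RML triples map (mapping assertion)."""
    name: str
    source: str
    joins: tuple = ()  # names of other triples maps this one joins with


def partition(maps):
    """Group triples maps that are connected by joins (assumed criterion).

    Uses union-find; every name in `joins` must refer to a map in `maps`.
    """
    parent = {m.name: m.name for m in maps}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for m in maps:
        for other in m.joins:
            parent[find(m.name)] = find(other)

    groups = {}
    for m in maps:
        groups.setdefault(find(m.name), []).append(m.name)
    return sorted(sorted(g) for g in groups.values())


def bushy_plan(partitions, cost):
    """Greedily merge the two cheapest subplans until one tree remains.

    Leaves are lists of triples-map names; inner nodes are (left, right)
    tuples. `cost` is an assumed estimate, e.g. the partition size.
    """
    tie = count()  # tie-breaker so the heap never compares plans directly
    heap = [(cost(p), next(tie), p) for p in partitions]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)
        c2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, next(tie), (left, right)))
    return heap[0][2]


maps = [
    TriplesMap("Gene", "genes.csv", joins=("Sample",)),
    TriplesMap("Sample", "samples.csv"),
    TriplesMap("Drug", "drugs.csv"),
]
parts = partition(maps)
print(parts)                        # [['Drug'], ['Gene', 'Sample']]
print(bushy_plan(parts, cost=len))  # (['Drug'], ['Gene', 'Sample'])
```

In a bushy tree, both children of a node may themselves be subplans, so independent partitions can be scheduled side by side rather than strictly one after another. In a real planner, each leaf would then be translated into a command that runs an RML engine on that partition; that step is omitted here because the abstract does not specify the command format.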


