Demonstration of LogicLib: An Expressive Multi-Language Interface over Scalable Datalog System

With the ever-increasing volume of data, there is an urgent need for expressive and efficient tools to support Big Data analytics. The declarative logic language Datalog has proven very effective at concisely expressing graph, machine learning, and knowledge discovery applications via recursive queries. In this demonstration, we develop Logic Library (LLib), a library of recursive algorithms written in Datalog that can be executed in BigDatalog, a Datalog engine that we developed on top of Apache Spark. LLib encapsulates complex logic-based algorithms into high-level APIs, which simplify development and provide a unified interface akin to that of Spark MLlib. Because LLib is fully compatible with DataFrame, its built-in applications and new Datalog queries can be combined with existing Spark functions, such as those provided by MLlib and Spark SQL. With a variety of examples, we will (i) show how to write programs with LLib to express a variety of applications; (ii) illustrate its user experience in the Apache Spark ecosystem; and (iii) present a user-friendly interface to interact with the LLib framework and monitor the query results.
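To make the phrase "recursive queries" concrete, the following is a minimal sketch of the classic transitive-closure program that Datalog expresses in two rules, evaluated with a simple semi-naive loop. It is plain Python written for illustration only: it does not use the LLib API or BigDatalog, and the function name `transitive_closure` and the edge representation are assumptions made for this example.

```python
from typing import Dict, Hashable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def transitive_closure(edges: Set[Edge]) -> Set[Edge]:
    """Illustrative semi-naive evaluation of the Datalog program

        tc(X, Y) <- edge(X, Y).
        tc(X, Y) <- tc(X, Z), edge(Z, Y).

    This is a teaching sketch, not how BigDatalog evaluates queries.
    """
    # Index edges by source node so the join tc(X, Z), edge(Z, Y) is a lookup.
    successors: Dict[Hashable, Set[Hashable]] = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    tc: Set[Edge] = set(edges)     # facts produced by the non-recursive rule
    delta: Set[Edge] = set(edges)  # facts that were new in the last iteration

    # Apply the recursive rule only to the newest facts until a fixpoint is reached.
    while delta:
        derived = {(x, y) for (x, z) in delta for y in successors.get(z, ())}
        delta = derived - tc
        tc |= delta
    return tc


if __name__ == "__main__":
    sample_edges = {("a", "b"), ("b", "c"), ("c", "d")}
    for pair in sorted(transitive_closure(sample_edges)):
        print(pair)
```

Joining only the newly derived facts (`delta`) against the edge relation avoids re-deriving known facts on every iteration, which is the general idea behind semi-naive evaluation in Datalog engines.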

