With the ever-increasing volume of data, there is an urgent need for expressive and efficient tools to support Big Data analytics. The declarative logic language Datalog has proven very effective at concisely expressing graph, machine learning, and knowledge discovery applications via recursive queries. In this demonstration, we present Logic Library (LLib), a library of recursive algorithms written in Datalog that can be executed in BigDatalog, a Datalog engine that we developed on top of Apache Spark. LLib encapsulates complex logic-based algorithms in high-level APIs, which simplify development and provide a unified interface akin to that of Spark MLlib. Because LLib is fully compatible with DataFrame, its built-in applications and new Datalog queries can be used together with existing Spark functions, such as those provided by MLlib and Spark SQL. With a variety of examples, we will (i) show how to write programs with LLib to express a range of applications; (ii) illustrate its user experience in the Apache Spark ecosystem; and (iii) present a user-friendly interface for interacting with the LLib framework and monitoring query results.
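To make the notion of a recursive query concrete, the following is a minimal sketch in plain Python of how the textbook transitive-closure program could be evaluated with semi-naive iteration. It is only an illustration: it does not use LLib, BigDatalog, or Spark, the function name `transitive_closure` is hypothetical, and the Datalog rules in the comments use generic textbook notation rather than the exact syntax of any particular engine.

```python
def transitive_closure(edges):
    """Evaluate, by semi-naive iteration, the recursive program

        tc(X, Y) :- edge(X, Y).
        tc(X, Y) :- tc(X, Z), edge(Z, Y).

    `edges` is an iterable of (source, target) pairs.
    Returns the set of all (X, Y) pairs such that Y is reachable from X.
    """
    edges = list(edges)

    # Index edge(Z, Y) by Z so the join in the recursive rule is a lookup.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, set()).add(dst)

    closure = set(edges)  # facts derived so far (base rule)
    delta = set(edges)    # facts that were new in the previous round

    # Each round joins only the new facts with edge, then keeps
    # the results that have not been derived before.
    while delta:
        derived = {
            (x, y)
            for (x, z) in delta
            for y in successors.get(z, ())
        }
        delta = derived - closure
        closure |= delta

    return closure


if __name__ == "__main__":
    sample_edges = [("a", "b"), ("b", "c"), ("c", "d")]
    for pair in sorted(transitive_closure(sample_edges)):
        print(pair)
```

The loop stops once a round derives no new facts, which is the fixpoint that a Datalog engine computes for a recursive rule. The two rules in the docstring carry the whole specification, while the imperative version has to manage the iteration and deduplication by hand.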