In this paper, we present BigDL, a distributed deep learning framework for Big Data platforms and workflows. It is implemented on top of Apache Spark and allows users to write their deep learning applications as standard Spark programs, running directly on large-scale big data clusters in a distributed fashion. It provides an expressive, "data-analytics integrated" deep learning programming model, so that users can easily build end-to-end analytics + AI pipelines under a unified programming paradigm; by implementing an AllReduce-like operation using existing primitives in Spark (e.g., shuffle, broadcast, and in-memory data persistence), it also provides an efficient "parameter server"-style architecture for highly scalable, data-parallel distributed training. Since its initial open-source release, BigDL users have built many analytics and deep learning applications (e.g., object detection, sequence-to-sequence generation, neural recommendation, and fraud detection) on Spark.
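To make the "data-analytics integrated" programming model concrete, the following is a minimal sketch of such an end-to-end pipeline written as a standard Spark program against BigDL's Python API; the input path, feature parsing, network shape, and hyper-parameter values are illustrative assumptions rather than details taken from the paper.

```python
# A minimal sketch of an end-to-end analytics + AI pipeline as a standard Spark
# program, assuming BigDL's 0.x Python API; the input path, parsing logic,
# network shape, and hyper-parameters below are illustrative placeholders.
import numpy as np
from pyspark import SparkContext
from bigdl.util.common import init_engine, create_spark_conf, Sample
from bigdl.nn.layer import Sequential, Linear, ReLU, LogSoftMax
from bigdl.nn.criterion import ClassNLLCriterion
from bigdl.optim.optimizer import Optimizer, SGD, MaxEpoch

# The same SparkContext drives both the data processing and the training.
sc = SparkContext(appName="bigdl-pipeline", conf=create_spark_conf())
init_engine()

# Data-analytics step: ordinary Spark transformations producing BigDL Samples.
raw = sc.textFile("hdfs:///path/to/data.csv")  # placeholder path
samples = (raw.map(lambda line: line.split(","))
              .map(lambda cols: Sample.from_ndarray(
                   np.array(cols[:-1], dtype="float32"),
                   np.array([float(cols[-1]) + 1.0]))))  # BigDL labels are 1-based

# Deep learning step: define a model and train it directly on the Spark RDD.
model = (Sequential()
         .add(Linear(10, 32)).add(ReLU())
         .add(Linear(32, 2)).add(LogSoftMax()))

optimizer = Optimizer(model=model,
                      training_rdd=samples,
                      criterion=ClassNLLCriterion(),
                      optim_method=SGD(),
                      end_trigger=MaxEpoch(5),
                      batch_size=256)
trained_model = optimizer.optimize()  # data-parallel training on the cluster
```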
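The abstract also notes that the AllReduce-like operation is composed from existing Spark primitives. The sketch below illustrates that idea with generic PySpark operations (broadcast for distributing the current weights, a shuffle via reduceByKey for aggregating gradient slices); it is a simplified stand-in, not BigDL's actual parameter-manager implementation, and for brevity the weight update is applied on the driver rather than on the parameter shards.

```python
# An illustrative sketch (not BigDL's actual implementation) of composing an
# AllReduce-like step from Spark primitives: weights are broadcast to tasks,
# each task computes a local gradient, the gradient is cut into slices keyed
# by slice id, and a shuffle (reduceByKey) sums each slice as a "parameter
# shard" would; the SGD update itself is done on the driver here for brevity.
import numpy as np
from pyspark import SparkContext

sc = SparkContext(appName="allreduce-sketch")

NUM_SLICES = 4    # number of parameter "shards" (assumed value)
DIM = 8           # toy model size (assumed value)
LR = 0.1          # learning rate (assumed value)

def local_gradient(partition, w):
    # Placeholder for a task's local gradient on its mini-batch: a toy
    # quadratic loss whose gradient is w - mean(batch).
    batch = np.array(list(partition), dtype="float32")
    return [w - batch.mean(axis=0)]

data = sc.parallelize(np.random.rand(1024, DIM).tolist(), 8).cache()
weights = np.zeros(DIM, dtype="float32")

for step in range(3):
    w_bc = sc.broadcast(weights)                     # in-memory weight distribution
    grads = data.mapPartitions(lambda it: local_gradient(it, w_bc.value))
    sliced = grads.flatMap(lambda g: enumerate(np.array_split(g, NUM_SLICES)))
    summed = sliced.reduceByKey(lambda a, b: a + b)  # shuffle: one "shard" per slice id
    slices = dict(summed.collect())
    grad = np.concatenate([slices[i] for i in range(NUM_SLICES)]) / data.getNumPartitions()
    weights = weights - LR * grad                    # SGD step on the averaged gradient
```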