Probabilistic Programming Languages (PPLs) are a powerful tool in machine learning, allowing complex generative models to be expressed succinctly. They couple sophisticated inference algorithms, implemented by the language, with an expressive modelling language that allows a user to implement any computable function as the generative model. Such languages are usually implemented on top of existing high-level programming languages and do not make use of hardware accelerators. PPLs that do make use of accelerators exist, but restrict the expressivity of the language in order to do so. In this paper, we present a language and toolchain that generates highly efficient code for both CPUs and GPUs. The language is functional in style, and the toolchain is built on top of LLVM. Our implementation uses delimited continuations on the CPU to perform inference, and custom CUDA code on the GPU. We obtain significant speedups across a suite of PPL workloads compared to other state-of-the-art approaches on the CPU. Furthermore, our compiler can also generate efficient code that runs on CUDA GPUs.
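To give a flavour of the continuation-based approach mentioned above, the following is a minimal, hypothetical sketch (not the paper's implementation or API) of a PPL written in continuation-passing style in Python: each `sample` site receives the rest of the program as a continuation, which is the core idea that delimited continuations make systematic. The names `sample`, `observe`, and `run_importance` are illustrative assumptions.

```python
import random
import math

# Illustrative sketch only: a sample site captures "the rest of the program"
# as an explicit continuation k, in the spirit of delimited continuations.

def sample(dist, k):
    """Draw a value from `dist` and pass it to the continuation `k`."""
    value = dist()
    return k(value)

def observe(log_weight, trace, k):
    """Accumulate a log-likelihood term into the trace, then continue."""
    trace["logw"] += log_weight
    return k(None)

def model(trace, k):
    """Toy model: latent mu ~ N(0, 1); observe y = 2.5 under N(mu, 1)."""
    def after_mu(mu):
        y = 2.5
        ll = -0.5 * (y - mu) ** 2 - 0.5 * math.log(2 * math.pi)
        return observe(ll, trace, lambda _: k(mu))
    return sample(lambda: random.gauss(0.0, 1.0), after_mu)

def run_importance(num_particles=1000):
    """Likelihood-weighted importance sampling over the CPS model."""
    total_w, weighted_mu = 0.0, 0.0
    for _ in range(num_particles):
        trace = {"logw": 0.0}
        mu = model(trace, lambda x: x)   # identity continuation delimits the program
        w = math.exp(trace["logw"])
        total_w += w
        weighted_mu += w * mu
    return weighted_mu / total_w

if __name__ == "__main__":
    print("estimated posterior mean of mu:", run_importance())
```

In a compiled PPL, the compiler rather than the programmer introduces these continuations, so that the inference engine can suspend, resume, or re-run the model at each sample site without the interpretive overhead shown here.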