With the advent of multi-core systems, GPUs and FPGAs, loop parallelization has become a promising way to speed up program execution. To keep up with the times, various performance-oriented programming languages provide a multitude of constructs that allow programmers to write parallelizable loops. Correspondingly, researchers have developed techniques to automatically parallelize loops that do not carry dependences across iterations and/or that call only pure functions. However, in managed languages with platform-independent runtimes, such as Java, it is practically infeasible to perform complex dependence analysis during JIT compilation. In this paper, we propose AutoTornado, a first-of-its-kind static+JIT loop parallelizer for Java programs that parallelizes loops for heterogeneous architectures using TornadoVM (a Graal-based VM that supports insertion of @Parallel constructs for loop parallelization). AutoTornado performs sophisticated dependence and purity analysis of Java programs statically, in the Soot framework, to generate constraints that encode the conditions under which a given loop can be parallelized. The generated constraints are then fed to the Z3 theorem prover (which we have integrated with Soot), and the canonical for loops found to be parallelizable are annotated with the @Parallel construct. We have also added runtime support in TornadoVM to use the static analysis results for loop parallelization. Our evaluation over several standard parallelization kernels shows that AutoTornado correctly parallelizes 61.3% of manually parallelizable loops, with an efficient static analysis and near-zero runtime overhead. To the best of our knowledge, AutoTornado is not only the first tool that performs program-analysis-based parallelization for a real-world JVM, but also the first to integrate Z3 with Soot for loop parallelization.
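To make the idea of a constraint under which a loop can be parallelized concrete, the following is a minimal sketch in plain Java. It is not AutoTornado's analysis and uses neither Soot nor Z3. It assumes a hypothetical canonical loop of the form `for (int i = lo; i < hi; i++) a[i + w] = f(a[i + r]);` with a pure `f`, and the class and method names are invented for illustration. The loop carries a dependence exactly when two distinct iterations `i` and `j` satisfy `i + w == j + r`. A solver would decide this kind of condition symbolically, whereas the sketch evaluates a closed form and cross-checks it by enumeration.

```java
// Hypothetical illustration only: not AutoTornado's implementation or API.
final class DependenceSketch {

    // Assumed loop shape: for (int i = lo; i < hi; i++) a[i + w] = f(a[i + r]);
    // Iterations i != j conflict iff i + w == j + r, i.e. j - i == w - r.
    // Such a pair exists iff the distance is nonzero and smaller than the trip count.
    static boolean carriesDependence(int w, int r, int lo, int hi) {
        long distance = (long) w - r;      // widen to long to avoid int overflow
        long tripCount = (long) hi - lo;   // zero or negative means the loop never runs
        return distance != 0 && Math.abs(distance) < tripCount;
    }

    // Brute-force check of the same condition, used only to validate the closed form.
    static boolean carriesDependenceBruteForce(int w, int r, int lo, int hi) {
        for (int i = lo; i < hi; i++) {
            for (int j = lo; j < hi; j++) {
                if (i != j && i + w == j + r) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // a[i] = f(a[i]): each iteration touches only its own element.
        System.out.println(carriesDependence(0, 0, 0, 8));   // false
        // a[i] = f(a[i - 1]): iteration i reads what iteration i - 1 wrote.
        System.out.println(carriesDependence(0, -1, 1, 8));  // true
        // a[i] = f(a[i + 16]) over 8 iterations: reads and writes never overlap.
        System.out.println(carriesDependence(0, 16, 0, 8));  // false
    }
}
```

Under these assumptions, only loops for which the check returns `false` would be candidates for the @Parallel annotation. Real loops also involve aliasing, multiple arrays, and method calls, which is where the purity analysis and a general-purpose solver become necessary.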