C is the lingua franca of programming and almost any device can be programmed using C. However, programming modern heterogeneous architectures such as multi-core CPUs and GPUs requires explicitly expressing parallelism as well as device-specific properties such as memory hierarchies. The resulting code is often hard to understand, debug, and modify for different architectures. We propose to lift C programs to a parametric dataflow representation that lends itself to static data-centric analysis and enables automatic high-performance code generation. We separate writing code from optimizing for different hardware: simple, portable C source code is used to generate efficient specialized versions with a click of a button. Our approach can identify parallelism when no other compiler can, and outperforms a bespoke parallelized version of a scientific proxy application by up to 21%.