Scripting languages such as Python and R have been widely adopted for the productive development of scientific software because of the power and expressiveness of the languages and their available libraries. However, deploying scripted applications on large-scale parallel computer systems such as the IBM Blue Gene/Q or Cray XE6 is challenging because of operating system limitations, interoperability difficulties, and parallel filesystem overheads caused by the small filesystem accesses common in scripted approaches, among other issues. We present a new approach to these problems in which the Swift scripting system integrates high-level scripts written in Python, R, and Tcl with native code developed in C, C++, and Fortran by linking Swift to the library interfaces of the script interpreters. In this approach, Swift handles data management, movement, and marshaling among distributed-memory processes, so the user does not need to manipulate low-level communication libraries such as MPI directly. We also present a technique for efficiently launching scripted applications on large-scale supercomputers using a hierarchical programming model.
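As a rough illustration of the idea of calling an interpreter through its library interface instead of launching a new interpreter process for each task, the following minimal Python sketch keeps one in-process interpreter namespace alive and passes data to and from each code fragment as serialized strings. It is only an analogy under stated assumptions: the class `EmbeddedInterpreter`, the function `run_tasks`, and the use of JSON for marshaling are hypothetical names and choices made for this example, and Python's built-in `exec` and `eval` stand in for an embedded interpreter library. It is not the Swift API and does not reproduce the system described above.

```python
import json


class EmbeddedInterpreter:
    """Hypothetical stand-in for a script interpreter loaded in-process as a library."""

    def __init__(self):
        # One namespace is reused for every task, so no new process is started
        # and interpreter state persists between fragments.
        self.namespace = {}

    def run(self, code: str, expr: str) -> str:
        """Execute a code fragment, then return the value of expr as a JSON string."""
        exec(code, self.namespace)
        return json.dumps(eval(expr, self.namespace))


def run_tasks(interp, inputs):
    """Marshal each input to a string, evaluate a fragment on it, unmarshal the result."""
    results = []
    for value in inputs:
        payload = json.dumps(value)  # marshal the input (assumed JSON encoding)
        code = f"import json\nx = json.loads({payload!r})\ny = x * x"
        results.append(json.loads(interp.run(code, "y")))  # unmarshal the output
    return results


if __name__ == "__main__":
    interp = EmbeddedInterpreter()
    print(run_tasks(interp, [1, 2, 3]))
```

In this sketch the caller only sees strings going in and coming out, which loosely mirrors the division of labor described in the abstract: the coordinating layer moves and marshals data, while the script fragment performs the computation.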