Scientific computing applications have benefited greatly from high-performance computing infrastructure such as supercomputers. However, we are seeing a paradigm shift in the computational structure, design, and requirements of these applications. Increasingly, data-driven and machine learning approaches are being used to support, speed up, and enhance scientific computing applications, especially molecular dynamics simulations. Concurrently, cloud computing platforms are increasingly appealing for scientific computing, providing "infinite" computing power, easier programming and deployment models, and access to computing accelerators such as TPUs (Tensor Processing Units). This confluence of machine learning (ML) and cloud computing creates exciting opportunities for cloud and systems researchers. ML-assisted molecular dynamics simulations are a new class of workload and exhibit unique computational patterns. These simulations present new challenges for low-cost and high-performance execution. We argue that transient cloud resources, such as low-cost preemptible cloud VMs, can be a viable platform for this new workload. Finally, we present some low-hanging fruit and long-term challenges in cloud resource management and in the integration of molecular dynamics simulations into ML platforms (such as TensorFlow).