Molecular dynamics (MD) simulation, a computationally intensive method that provides invaluable insights into the behavior of biomolecules, typically requires large-scale parallelization. Implementation of fast parallel MD simulation demands both high bandwidth and low latency for inter-node communication, but in current semiconductor technology, neither of these properties is scaling as quickly as intra-node computational capacity. This disparity in scaling necessitates architectural innovations to maximize the utilization of computational units. For Anton 3, the latest in a family of highly successful special-purpose supercomputers designed for MD simulations, we thus designed and built a completely new specialized network as part of our ASIC. Tightly integrating this network with specialized computation pipelines enables Anton 3 to perform simulations orders of magnitude faster than any general-purpose supercomputer, and to outperform its predecessor, Anton 2 (the state of the art prior to Anton 3), by an order of magnitude. In this paper, we present the three key features of the network that contribute to the high performance of Anton 3. First, through architectural optimizations, the network achieves very low end-to-end inter-node communication latency for fine-grained messages, allowing for better overlap of computation and communication. Second, novel application-specific compression techniques reduce the size of most messages sent between nodes, thereby increasing effective inter-node bandwidth. Lastly, a new hardware synchronization primitive, called a network fence, supports fast fine-grained synchronization tailored to the data flow within a parallel MD application. These application-driven specializations to the network are critical for Anton 3's MD simulation performance advantage over all other machines.