Brain simulation, as one of the latest advances in artificial intelligence, facilitates a better understanding of how information is represented and processed in the brain. The extreme complexity of the human brain makes brain simulation feasible only on high-performance computing platforms. Supercomputers with a large number of interconnected graphics processing units (GPUs) are currently employed to support brain simulations. Therefore, high-throughput, low-latency inter-GPU communication in supercomputers plays a crucial role in meeting the performance requirements of brain simulation as a highly time-sensitive application. In this paper, we first provide an overview of current parallelization technologies for brain simulation on multi-GPU architectures. Then, we analyze the communication challenges of brain simulation and summarize design guidelines for addressing them. Furthermore, we propose a partitioning algorithm and a two-level routing method to achieve efficient low-latency communication in multi-GPU architectures for brain simulation. We report experimental results obtained on a supercomputer with 2,000 GPUs simulating a brain model with 10 billion neurons, showing that our approach significantly improves communication performance. Finally, we discuss open issues and identify research directions for low-latency communication design for brain simulation.
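The abstract mentions a two-level routing method for inter-GPU communication but does not detail it. As a rough illustration of what hierarchical, two-level exchange can look like in practice, the following is a minimal MPI sketch, written as an assumption for illustration rather than the paper's actual implementation: spike data are first gathered within each node, and only designated node leaders take part in the inter-node exchange, reducing the number of messages that cross the network fabric. The communicator setup, the leader-based Allgather, and the toy per-rank spike counts are all hypothetical.

// Hypothetical sketch of two-level (intra-node, then inter-node) spike
// exchange; it illustrates hierarchical routing in general, not the
// paper's specific method.
#include <mpi.h>
#include <vector>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Level 1: split ranks by shared-memory node; GPUs on the same node
    // would exchange spikes locally (e.g., over NVLink or shared memory).
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, world_rank,
                        MPI_INFO_NULL, &node_comm);
    int node_rank, node_size;
    MPI_Comm_rank(node_comm, &node_rank);
    MPI_Comm_size(node_comm, &node_size);

    // Toy payload: each rank contributes one spike count per step.
    int local_spikes = world_rank + 1;

    // Gather all spike counts produced on this node onto the node leader.
    std::vector<int> node_spikes(node_size);
    MPI_Gather(&local_spikes, 1, MPI_INT,
               node_spikes.data(), 1, MPI_INT, 0, node_comm);

    // Level 2: only node leaders participate in the inter-node exchange,
    // which cuts the number of messages crossing the network fabric.
    MPI_Comm leader_comm;
    MPI_Comm_split(MPI_COMM_WORLD, node_rank == 0 ? 0 : MPI_UNDEFINED,
                   world_rank, &leader_comm);

    if (node_rank == 0) {
        int node_total = 0;
        for (int s : node_spikes) node_total += s;

        int leader_count;
        MPI_Comm_size(leader_comm, &leader_count);
        std::vector<int> all_node_totals(leader_count);
        MPI_Allgather(&node_total, 1, MPI_INT,
                      all_node_totals.data(), 1, MPI_INT, leader_comm);
        printf("node leader %d: received totals from %d nodes\n",
               world_rank, leader_count);
        MPI_Comm_free(&leader_comm);
    }

    // Leaders would then redistribute remote spikes to their local ranks;
    // a real simulator routes per-target-neuron payloads instead of sums.
    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}

In this pattern, the partitioning algorithm mentioned in the abstract would determine which neurons live on which GPU so that most synaptic traffic stays within a node, and the two-level routing then handles whatever traffic must still cross nodes.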