Molecular property calculations are the bedrock of chemical physics. High-fidelity \textit{ab initio} modeling techniques for computing molecular properties can be prohibitively expensive, motivating the development of machine-learning models that make the same predictions more efficiently. Training graph neural networks over large molecular databases introduces unique computational challenges, such as the need to process millions of small graphs of variable size, with communication patterns distinct from those of learning over a single large graph such as a social network. This paper demonstrates a novel hardware-software co-design approach to scale up the training of graph neural networks for molecular property prediction. We introduce an algorithm that coalesces batches of molecular graphs into fixed-size packs, eliminating the redundant computation and memory associated with alternative padding techniques and improving throughput by minimizing communication. We demonstrate the effectiveness of our co-design approach by providing an implementation of a well-established molecular property prediction model on the Graphcore Intelligence Processing Unit (IPU). We evaluate the training performance on multiple molecular graph databases with varying graph counts, sizes, and sparsity. We show that this co-design approach can reduce the training time of such molecular property prediction models from days to less than two hours, opening new possibilities for AI-driven scientific discovery.
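To make the packing step concrete, the following minimal sketch shows one possible first-fit-decreasing heuristic that groups molecular graphs by node count into packs with a fixed node budget; the function name \texttt{pack\_graphs}, the \texttt{max\_nodes\_per\_pack} budget, and the greedy heuristic itself are illustrative assumptions, not the exact algorithm introduced in this work.

\begin{verbatim}
import numpy as np

def pack_graphs(num_nodes, max_nodes_per_pack):
    # Greedy first-fit-decreasing packing (illustrative sketch, not the
    # paper's algorithm): place each graph into the first pack whose
    # total node count stays within the fixed budget.
    order = np.argsort(num_nodes)[::-1]          # largest graphs first
    packs, loads = [], []
    for idx in order:
        size = int(num_nodes[idx])
        for p, load in enumerate(loads):
            if load + size <= max_nodes_per_pack:
                packs[p].append(idx)
                loads[p] += size
                break
        else:
            packs.append([idx])                  # open a new pack
            loads.append(size)
    return packs

# Example: molecules of varying size packed into packs of <= 64 nodes.
sizes = np.array([9, 12, 30, 5, 44, 18, 27, 7])
print(pack_graphs(sizes, max_nodes_per_pack=64))
\end{verbatim}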