Particle-based modeling of materials at the atomic scale plays an important role in the development of new materials and the understanding of their properties. The accuracy of particle simulations is determined by interatomic potentials, which allow the potential energy of an atomic system to be calculated as a function of atomic coordinates and potentially other properties. First-principles-based ab initio potentials can reach arbitrary levels of accuracy; however, their applicability is limited by their high computational cost. Machine learning (ML) has recently emerged as an effective way to offset the high computational cost of ab initio atomic potentials by replacing expensive models with highly efficient surrogates trained on electronic structure data. Among the plethora of current methods, symbolic regression (SR) is gaining traction as a powerful "white-box" approach for discovering functional forms of interatomic potentials. This contribution discusses the role of symbolic regression in Materials Science (MS) and offers a comprehensive overview of current methodological challenges and state-of-the-art results. A genetic programming-based approach for modeling atomic potentials from raw data (consisting of snapshots of atomic positions and associated potential energy) is presented and empirically validated on ab initio electronic structure data.
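To make the notion of a "functional form" concrete, the following minimal sketch evaluates the potential energy of a single snapshot of atomic positions with a closed-form pair potential. It uses the textbook Lennard-Jones expression purely as an illustration; it is not the potential discovered in this work, and the function names, parameter values, and snapshot are hypothetical.

```python
import itertools
import math


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy at distance r (illustrative functional form).

    epsilon and sigma are assumed reduced-unit parameters, not fitted values.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_energy(positions, pair_potential=lennard_jones):
    """Sum a pair potential over all unique atom pairs in one snapshot."""
    return sum(
        pair_potential(math.dist(a, b))
        for a, b in itertools.combinations(positions, 2)
    )


# Hypothetical snapshot: two atoms separated by the Lennard-Jones
# equilibrium distance 2^(1/6) * sigma, where the pair energy is -epsilon.
r_min = 2 ** (1 / 6)
snapshot = [(0.0, 0.0, 0.0), (r_min, 0.0, 0.0)]
energy = total_energy(snapshot)
```

In a symbolic regression setting, an expression such as the body of `lennard_jones` would be the object being searched for, with candidate expressions scored against reference energies of many such snapshots.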