Reconstructing force fields (FFs) from atomistic simulation data is challenging because accurate reference data can be very expensive to obtain. Machine learning (ML) models can help here by being data-efficient, since they can be constrained by the underlying symmetries and conservation laws of physics. So far, however, every newly proposed descriptor for an ML model has required cumbersome and mathematically tedious remodeling. We therefore propose using modern techniques from algorithmic differentiation within the ML modeling process, which makes it possible to use novel descriptors or models fully automatically and at an order of magnitude higher computational efficiency. This approach enables the versatile use of novel representations and the efficient computation of larger systems, both of high value to the FF community, as well as the simple inclusion of further physical knowledge such as higher-order information (e.g., Hessians or more complex partial differential equation constraints), even beyond the FF domain presented here.
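As an illustration of the idea, the following minimal JAX sketch shows how algorithmic differentiation can supply forces and Hessians for an arbitrary descriptor without any hand-derived gradients. It is not the model proposed in this work: the `descriptor`, `energy`, and `params` names, the inverse-distance descriptor, and the toy energy expression are all assumptions chosen only to keep the example short.

```python
import jax
import jax.numpy as jnp

# Double precision keeps the symmetry checks on forces and Hessians tight.
jax.config.update("jax_enable_x64", True)


def descriptor(positions):
    """Hypothetical descriptor: inverse pairwise distances.

    It is invariant to global translations and rotations of the system.
    """
    n_atoms = positions.shape[0]
    i, j = jnp.triu_indices(n_atoms, k=1)  # each atom pair once
    diff = positions[i] - positions[j]
    return 1.0 / jnp.linalg.norm(diff, axis=-1)


def energy(params, positions):
    """Hypothetical toy energy model: weighted sum of squared descriptor entries."""
    d = descriptor(positions)
    return jnp.sum(params * d**2)


def forces(params, positions):
    """Forces as the negative gradient of the energy w.r.t. atomic positions."""
    return -jax.grad(energy, argnums=1)(params, positions)


# Second derivatives w.r.t. positions come from the same energy function.
hessian = jax.hessian(energy, argnums=1)

# Three atoms in 3D (arbitrary example geometry) and one weight per atom pair.
positions = jnp.array([[0.0, 0.0, 0.0],
                       [1.0, 0.0, 0.0],
                       [0.0, 1.2, 0.0]])
params = jnp.array([1.0, 0.5, 2.0])

F = forces(params, positions)   # shape (3, 3): one force vector per atom
H = hessian(params, positions)  # shape (3, 3, 3, 3)
```

Swapping in a different descriptor only requires replacing the body of `descriptor`; `forces` and `hessian` remain unchanged, because the derivatives are generated automatically. Since the energy depends on interatomic distances alone, the resulting forces sum to zero over all atoms, which gives a quick sanity check.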