Computational study of molecules and materials from first principles is a cornerstone of physics, chemistry, and materials science, but is limited by the cost of accurate and precise simulations. In settings involving many simulations, machine learning can reduce these costs, often by orders of magnitude, by interpolating between reference simulations. This requires representations that describe any molecule or material and support interpolation. We comprehensively review and discuss current representations and the relations between them, using a unified mathematical framework based on many-body functions, group averaging, and tensor products. For selected state-of-the-art representations, we compare energy predictions for organic molecules, binary alloys, and Al-Ga-In sesquioxides in numerical experiments controlled for data distribution, regression method, and hyper-parameter optimization.