Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all possible periodic arrangements of atoms: 1) the coordinates must lie in the local energy minimum defined by quantum mechanics, and 2) global stability also requires the structure to follow the complex, yet specific bonding preferences between different atom types. Existing methods fail to incorporate these factors and often lack proper invariances. We propose a Crystal Diffusion Variational Autoencoder (CDVAE) that captures the physical inductive bias of material stability. By learning from the data distribution of stable materials, the decoder generates materials in a diffusion process that moves atomic coordinates towards a lower energy state and updates atom types to satisfy bonding preferences between neighbors. Our model also explicitly encodes interactions across periodic boundaries and respects permutation, translation, rotation, and periodic invariances. We significantly outperform past methods in three tasks: 1) reconstructing the input structure, 2) generating valid, diverse, and realistic materials, and 3) generating materials that optimize a specific property. We also provide several standard datasets and evaluation metrics for the broader machine learning community.
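To make the invariances named above concrete, the following minimal sketch (not the CDVAE implementation) computes pairwise atomic distances under periodic boundary conditions and checks that they do not change when the atoms are translated, wrapped back into the unit cell, or reordered. The function names `frac_to_cart`, `periodic_distance`, and `distance_fingerprint` are hypothetical, and searching only the 27 nearest cell images is an assumption that holds for cells that are not strongly skewed.

```python
import itertools
import math


def frac_to_cart(frac, lattice):
    """Convert fractional coordinates to Cartesian; lattice rows are the cell vectors."""
    return tuple(sum(frac[i] * lattice[i][k] for i in range(3)) for k in range(3))


def periodic_distance(f1, f2, lattice):
    """Shortest distance between two atoms, considering neighboring periodic images."""
    # Wrap the fractional difference into [-0.5, 0.5) along each axis.
    d = [((a - b + 0.5) % 1.0) - 0.5 for a, b in zip(f1, f2)]
    best = math.inf
    # Assumption: the nearest image lies within one cell shift in each direction.
    for shift in itertools.product((-1, 0, 1), repeat=3):
        cart = frac_to_cart([d[i] + shift[i] for i in range(3)], lattice)
        best = min(best, math.sqrt(sum(c * c for c in cart)))
    return best


def distance_fingerprint(fracs, lattice):
    """Sorted list of all pairwise periodic distances (order-independent)."""
    return sorted(
        periodic_distance(a, b, lattice)
        for a, b in itertools.combinations(fracs, 2)
    )


if __name__ == "__main__":
    cubic = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
    atoms = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (0.9, 0.1, 0.0)]

    # Translate every atom, wrap into the cell, and reverse the atom order.
    t = (0.3, 0.7, 0.25)
    moved = [tuple((x + dx) % 1.0 for x, dx in zip(a, t)) for a in atoms][::-1]

    same = all(
        math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-9)
        for p, q in zip(
            distance_fingerprint(atoms, cubic),
            distance_fingerprint(moved, cubic),
        )
    )
    print("fingerprint unchanged:", same)
```

A representation built only from such periodic distances is unchanged by translation, permutation, and the choice of unit-cell image; it is also unaffected by a rigid rotation of the whole cell, since rotations preserve distances.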