The use of machine learning is becoming increasingly common in computational materials science. To build effective models of the chemistry of materials, useful machine-based representations of atoms and their compounds are required. We derive distributed representations of compounds from their chemical formulas only, via pooling operations of distributed representations of atoms. These compound representations are evaluated on ten different tasks, such as the prediction of formation energy and band gap, and are found to be competitive with existing benchmarks that make use of structure, and even superior in cases where only composition is available. Finally, we introduce a new approach for learning distributed representations of atoms, named SkipAtom, which makes use of the growing information in materials structure databases.
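To make the pooling idea concrete, the following is a minimal sketch of how a compound vector might be built from a chemical formula alone. It is not the authors' implementation: the atom vectors in `ATOM_VECTORS` are made-up toy values (in practice they would be learned, for example by SkipAtom), the helper names `parse_formula` and `pool` are hypothetical, and the choice of sum, mean, and max as pooling operations is an assumption about what "pooling" covers here. The parser handles only flat formulas without parentheses.

```python
import re
from typing import Dict, List

# Toy 3-dimensional atom vectors. The values are invented for illustration;
# real distributed representations would be learned from data.
ATOM_VECTORS: Dict[str, List[float]] = {
    "Na": [1.0, 0.0, 2.0],
    "Cl": [0.0, 3.0, 1.0],
    "O": [2.0, 2.0, 0.0],
    "H": [0.5, 1.0, 0.0],
}


def parse_formula(formula: str) -> Dict[str, int]:
    """Parse a flat formula such as 'H2O' into element counts (no parentheses)."""
    counts: Dict[str, int] = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def pool(formula: str, mode: str = "mean") -> List[float]:
    """Pool atom vectors into one fixed-length compound vector."""
    counts = parse_formula(formula)
    # Repeat each atom vector once per atom in the formula unit,
    # so that stoichiometry influences the pooled result.
    vectors = [ATOM_VECTORS[el] for el, n in counts.items() for _ in range(n)]
    columns = list(zip(*vectors))  # one tuple per embedding dimension
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")
```

Whatever the number of atoms in the formula, the pooled vector has the same length as a single atom vector, which is what allows compounds of different sizes to be fed to the same downstream model. Note that max pooling as written ignores stoichiometry, while sum and mean pooling reflect it.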