A new semi-supervised machine learning method for discovering structure-spectrum relationships is developed and demonstrated on the interpretation of X-ray absorption near-edge structure (XANES) spectra. The method constructs a one-to-one mapping between individual structure descriptors and spectral trends. Specifically, an adversarial autoencoder is augmented with a novel rank constraint (RankAAE). The RankAAE methodology produces a continuous and interpretable latent space in which each dimension can track an individual structure descriptor. In doing so, the model provides a robust and quantitative measure of the structure-spectrum relationship by decoupling the intertwined spectral contributions of multiple structural characteristics. This makes it well suited to spectral interpretation and to the discovery of new descriptors. The capability of the procedure is showcased on five local structure descriptors and a database of over fifty thousand simulated XANES spectra across eight first-row transition metal oxide families. The resulting structure-spectrum relationships not only reproduce known trends in the literature, but also reveal unintuitive ones that are visually indiscernible in large data sets. The results suggest that the RankAAE methodology has great potential to help researchers interpret complex scientific data, test physical hypotheses, and reveal new patterns that extend scientific insight.
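The abstract does not spell out the form of the rank constraint. As a minimal sketch, assuming the underlying idea is that the ordering of samples along one latent dimension should agree with their ordering by one structure descriptor, that agreement can be quantified with Spearman's rank correlation. Everything below is hypothetical and for illustration only: the function names, the synthetic "descriptors", and the hand-built "latent" variables are assumptions, not the RankAAE implementation or its data.

```python
import numpy as np


def rank_transform(values):
    """Return 0-based ranks of a 1-D array (ties are not averaged)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    ra, rb = rank_transform(a), rank_transform(b)
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))


def rank_agreement_matrix(latent, descriptors):
    """Entry (i, j) is the rank correlation between latent dimension i
    and descriptor j. A one-to-one mapping would show up as a matrix
    that is close to the identity (up to sign)."""
    n_latent, n_desc = latent.shape[1], descriptors.shape[1]
    out = np.empty((n_latent, n_desc))
    for i in range(n_latent):
        for j in range(n_desc):
            out[i, j] = spearman(latent[:, i], descriptors[:, j])
    return out


# Synthetic stand-in data (assumption): three independent descriptors.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 3))

# Hypothetical latent variables: each is a monotonic but nonlinear
# function of exactly one descriptor, so ranks are preserved.
latent = np.column_stack([
    np.tanh(descriptors[:, 0]),
    descriptors[:, 1] ** 3,
    np.exp(descriptors[:, 2]),
])

matrix = rank_agreement_matrix(latent, descriptors)
print(np.round(matrix, 2))
```

Rank correlation is used here rather than linear correlation because it depends only on ordering, so a latent dimension that tracks a descriptor through any monotonic mapping still scores highly. A trained model would supply the latent coordinates from its encoder in place of the hand-built `latent` array.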