Detecting and quantifying the products of cellular metabolism with mass spectrometry (MS) has shown great promise in many biological and biomedical applications. The biggest challenge in metabolomics is annotation, in which measured spectra are assigned chemical identities. Despite advances, current methods still provide only limited annotation of measured spectra. Here, we explore using graph neural networks (GNNs) to predict spectra, with a molecular graph as the model input. The model is trained and tested on the NIST 17 LC-MS dataset. We compare our results to NEIMS, a neural network model that takes molecular fingerprints as input. Our results show that GNN-based models outperform NEIMS. Importantly, we show that ranking results depend heavily on the candidate set size and on the similarity of the candidates to the target molecule, highlighting the need for consistent, well-characterized evaluation protocols in this domain.
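The dependence of ranking results on the candidate set can be illustrated with a minimal sketch. It is not the evaluation protocol used in this work: the cosine similarity score, the function names `cosine_similarity` and `rank_of_target`, and the toy spectra are all assumptions made for illustration. Each candidate's predicted spectrum is compared with the measured spectrum, and the rank of the true molecule is the number of candidates that score strictly higher, plus one.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two binned spectra given as {mz_bin: intensity}."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_of_target(measured, predicted, target):
    """1-based rank of `target` among candidates scored against `measured`.

    `predicted` maps candidate name -> predicted spectrum (hypothetical format).
    """
    scores = {name: cosine_similarity(measured, spec)
              for name, spec in predicted.items()}
    # Rank = 1 + number of candidates scoring strictly higher than the target.
    return 1 + sum(s > scores[target] for s in scores.values())

# Toy data (invented): one measured spectrum and predicted spectra for candidates.
measured = {100: 1.0, 150: 0.5, 200: 0.2}
target_pred = {100: 0.9, 150: 0.6, 200: 0.1}

# Candidate set of structurally dissimilar molecules: no shared peaks.
easy_set = {
    "target": target_pred,
    "dissimilar_1": {50: 1.0, 75: 0.4},
    "dissimilar_2": {300: 1.0, 120: 0.3},
}

# Same set plus one candidate whose predicted spectrum is very close to the target's.
hard_set = dict(easy_set, near_isomer={100: 1.0, 150: 0.5, 200: 0.25})

easy_rank = rank_of_target(measured, easy_set, "target")
hard_rank = rank_of_target(measured, hard_set, "target")
print(easy_rank, hard_rank)
```

In this toy setup the same predicted spectrum for the target ranks first against dissimilar candidates but drops once a single close candidate is added, which is why reported ranking metrics are hard to compare unless the candidate set construction is also reported.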