Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-MS) is a cornerstone of biomolecular analysis, identifying pathogens through their characteristic mass spectral signatures. Its reliance on labor-intensive sample preparation and multi-shot spectral averaging, however, confines it to laboratory settings and makes it impractical for real-time environmental monitoring. These limitations are especially pronounced in emerging aerosol MALDI-MS systems, where autonomous sampling produces noisy spectra of unknown aerosol analytes and effective analysis requires single-shot detection. To address these challenges, we propose the Mass Spectral Dictionary-Guided Transformer (MS-DGFormer): a data-driven framework that processes raw, minimally prepared mass spectral data directly. MS-DGFormer uses a transformer architecture designed to capture the long-range dependencies inherent in these time-series spectra. To enhance feature extraction, we introduce a dictionary encoder that integrates denoised spectral information derived from Singular Value Decomposition (SVD), enabling the model to recover critical biomolecular patterns from single-shot spectra robustly. The resulting system identifies pathogens from aerosol samples and supports autonomous, real-time analysis under field conditions. By eliminating the need for extensive preprocessing, our method opens the way to portable, deployable MALDI-MS platforms for environmental pathogen detection and rapid response to biological threats.
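To make the SVD-based denoising mentioned above concrete, the following is a minimal sketch of a generic truncated-SVD (low-rank) approximation applied to a stack of noisy spectra. It is not the MS-DGFormer dictionary construction, which the abstract does not specify. The function name `svd_denoise`, the matrix layout (shots by m/z bins), the chosen rank, and the synthetic two-peak data are all assumptions made for illustration.

```python
import numpy as np


def svd_denoise(spectra: np.ndarray, rank: int) -> np.ndarray:
    """Return the rank-`rank` truncated-SVD approximation of `spectra`.

    `spectra` is assumed to have shape (n_shots, n_bins): one row per
    single-shot spectrum, one column per m/z (or time-of-flight) bin.
    """
    U, s, Vt = np.linalg.svd(spectra, full_matrices=False)
    # Keep only the leading `rank` singular components; the discarded
    # components are treated as noise under the low-rank assumption.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


# Synthetic stand-in for real data (hypothetical peak positions and widths).
rng = np.random.default_rng(0)
n_shots, n_bins = 64, 512
bins = np.arange(n_bins)
template = (
    np.exp(-0.5 * ((bins - 200) / 4.0) ** 2)
    + 0.6 * np.exp(-0.5 * ((bins - 350) / 6.0) ** 2)
)

# Each shot is the same peak pattern with a different intensity, so the
# clean matrix is exactly rank 1; Gaussian noise is then added per bin.
amplitudes = rng.uniform(0.5, 1.5, size=(n_shots, 1))
clean = amplitudes * template
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

denoised = svd_denoise(noisy, rank=1)

# Frobenius-norm error against the known clean signal, before and after.
err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(f"error before: {err_noisy:.2f}, after: {err_denoised:.2f}")
```

In this toy setting the clean signal is rank 1 by construction, so `rank=1` is the natural choice. With real spectra the appropriate rank is not known in advance and would have to be chosen from the singular value spectrum or by validation.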