Deep-learning models have enabled performance leaps in analysis of high-dimensional functional MRI (fMRI) data. Yet, many previous methods are suboptimally sensitive to contextual representations across diverse time scales. Here, we present BolT, a blood-oxygen-level-dependent transformer model for analyzing multi-variate fMRI time series. BolT leverages a cascade of transformer encoders equipped with a novel fused window attention mechanism. Encoding is performed on temporally-overlapped windows within the time series to capture local representations. To integrate information temporally, cross-window attention is computed between base tokens in each window and fringe tokens from neighboring windows. To gradually transition from local to global representations, the extent of window overlap, and thereby the number of fringe tokens, is progressively increased across the cascade. Finally, a novel cross-window regularization is employed to align high-level classification features across the time series. Comprehensive experiments on large-scale public datasets demonstrate the superior performance of BolT against state-of-the-art methods. Furthermore, explanatory analyses to identify landmark time points and regions that contribute most significantly to model decisions corroborate prominent neuroscientific findings in the literature.
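To make the windowing scheme concrete, below is a minimal numpy sketch of the two ideas the abstract names: splitting the time series into temporally-overlapped windows whose base tokens are flanked by fringe tokens borrowed from neighboring windows, and a cross-window attention step in which base tokens act as queries over base plus fringe tokens. All function names, parameters, and the single-head formulation are illustrative assumptions for exposition, not BolT's actual implementation.

```python
import numpy as np

def split_windows(series, window_size, stride, fringe):
    """Split a (T, R) fMRI time series (T time points, R regions) into
    overlapping windows. Each window holds `window_size` base tokens plus
    up to `fringe` tokens taken from each temporal neighbor.
    Hypothetical helper for illustration; not BolT's exact API."""
    T = series.shape[0]
    windows = []
    for start in range(0, T - window_size + 1, stride):
        end = start + window_size
        base = series[start:end]                      # base tokens of this window
        left = series[max(0, start - fringe):start]   # fringe tokens from left neighbor
        right = series[end:min(T, end + fringe)]      # fringe tokens from right neighbor
        windows.append((left, base, right))
    return windows

def cross_window_attention(base, fringe_left, fringe_right):
    """Single-head scaled dot-product attention in which base tokens query
    over the union of base and fringe tokens, so each window mixes in
    context from its temporal neighbors."""
    keys = np.concatenate([fringe_left, base, fringe_right], axis=0)
    scores = base @ keys.T / np.sqrt(base.shape[1])
    # numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ keys

# Toy example: 20 time points, 4 regions
ts = np.arange(80, dtype=float).reshape(20, 4)
wins = split_windows(ts, window_size=8, stride=4, fringe=2)
left, base, right = wins[1]
attended = cross_window_attention(base, left, right)
```

Growing `fringe` across the encoder cascade mirrors the abstract's gradual transition from local to global representations: early stages attend mostly within a window, later stages see progressively more of the neighboring context.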