MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding

Decoding visual experiences from fMRI offers a powerful avenue to understand human perception and develop advanced brain-computer interfaces. However, current progress often prioritizes maximizing reconstruction fidelity while overlooking interpretability, an essential aspect for deriving neuroscientific insight. To address this gap, we propose MoRE-Brain, a neuro-inspired framework designed for high-fidelity, adaptable, and interpretable visual reconstruction. MoRE-Brain uniquely employs a hierarchical Mixture-of-Experts architecture where distinct experts process fMRI signals from functionally related voxel groups, mimicking specialized brain networks. The experts are first trained to encode fMRI into the frozen CLIP space. A finetuned diffusion model then synthesizes images, guided by expert outputs through a novel dual-stage routing mechanism that dynamically weighs expert contributions across the diffusion process. MoRE-Brain offers three main advancements: First, it introduces a novel Mixture-of-Experts architecture grounded in brain network principles for neuro-decoding. Second, it achieves efficient cross-subject generalization by sharing core expert networks while adapting only subject-specific routers. Third, it provides enhanced mechanistic insight, as the explicit routing reveals precisely how different modeled brain regions shape the semantic and spatial attributes of the reconstructed image. Extensive experiments validate MoRE-Brain's high reconstruction fidelity, with bottleneck analyses further demonstrating its effective utilization of fMRI signals, distinguishing genuine neural decoding from over-reliance on generative priors. Consequently, MoRE-Brain marks a substantial advance towards more generalizable and interpretable fMRI-based visual decoding. Code will be publicly available soon: https://github.com/yuxiangwei0808/MoRE-Brain.
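The idea of sharing expert networks across subjects while adapting only a subject-specific router can be illustrated with a small sketch. This is not the authors' implementation: it uses a single softmax gate rather than the hierarchical, dual-stage routing described above, it replaces the CLIP embedding with a small toy vector, and it assumes every subject has the same voxel layout. All names (`SharedExperts`, `SubjectRouter`, `decode`, the voxel group boundaries) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class SharedExperts:
    """Experts shared by all subjects; each maps one voxel group to an embedding."""

    def __init__(self, group_sizes, embed_dim, rng):
        self.weights = [random_matrix(embed_dim, n, rng) for n in group_sizes]

    def __call__(self, voxel_groups):
        return [linear(w, g) for w, g in zip(self.weights, voxel_groups)]


class SubjectRouter:
    """Subject-specific gate: one softmax weight per expert (assumed form)."""

    def __init__(self, n_voxels, n_experts, rng):
        self.weights = random_matrix(n_experts, n_voxels, rng)

    def __call__(self, fmri):
        return softmax(linear(self.weights, fmri))


def decode(fmri, group_slices, experts, router):
    """Return the gate-weighted sum of expert embeddings and the gate weights."""
    groups = [fmri[a:b] for a, b in group_slices]
    outputs = experts(groups)
    gates = router(fmri)
    dim = len(outputs[0])
    embedding = [
        sum(g * out[d] for g, out in zip(gates, outputs)) for d in range(dim)
    ]
    return embedding, gates


rng = random.Random(0)
group_slices = [(0, 4), (4, 10), (10, 12)]  # hypothetical voxel groups
experts = SharedExperts([b - a for a, b in group_slices], embed_dim=8, rng=rng)
routers = {
    "subj01": SubjectRouter(12, 3, rng),
    "subj02": SubjectRouter(12, 3, rng),
}
fmri = [rng.gauss(0.0, 1.0) for _ in range(12)]
embedding, gates = decode(fmri, group_slices, experts, routers["subj01"])
_, gates_other = decode(fmri, group_slices, experts, routers["subj02"])
```

In this toy setup, adapting to a new subject would mean fitting only a new `SubjectRouter` while `experts` stays fixed. The gate weights returned by `decode` are the quantity one would inspect to see how much each voxel group contributes to a given reconstruction.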
