Self-Supervised Multimodal Domino: in Search of Biomarkers for Alzheimer's Disease

Sensory input from multiple sources is crucial for robust and coherent human perception. Different sources contribute complementary explanatory factors. Similarly, research studies often collect multimodal imaging data, in which each modality can provide both shared and unique information. This observation motivated the design of powerful multimodal self-supervised representation-learning algorithms. In this paper, we unify recent work on multimodal self-supervised learning under a single framework. Observing that most self-supervised methods optimize similarity metrics between a set of model components, we propose a taxonomy of all reasonable ways to organize this process. We first evaluate models on toy multimodal MNIST datasets and then apply them to a multimodal neuroimaging dataset with Alzheimer's disease patients. We find that (1) multimodal contrastive learning has significant benefits over its unimodal counterpart, (2) the specific composition of multiple contrastive objectives is critical to performance on a downstream task, and (3) maximization of the similarity between representations has a regularizing effect on a neural network, which can sometimes lead to reduced downstream performance but still reveal multimodal relations. Results show that the proposed approach outperforms previous self-supervised encoder-decoder methods based on canonical correlation analysis (CCA) or the mixture-of-experts multimodal variational autoencoder (MMVAE) on various datasets with a linear evaluation protocol. Importantly, we find a promising solution to uncover connections between modalities through a jointly shared subspace that can help advance work in our search for neuroimaging biomarkers.
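To make the idea of "optimizing a similarity metric between model components" concrete, the following is a minimal sketch of one common cross-modal contrastive objective, an InfoNCE-style loss. It is an illustration under stated assumptions, not the method of the paper: the abstract does not specify the loss, and the function names (`cosine`, `cross_modal_info_nce`), the cosine similarity, the temperature value, and the one-directional form (modality A queried against modality B) are all hypothetical choices made here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_modal_info_nce(z_a, z_b, temperature=0.1):
    """Average one-directional InfoNCE loss between two modalities.

    Assumption: z_a[i] and z_b[i] are embeddings of the same subject
    (the positive pair); every z_b[j] with j != i acts as a negative.
    """
    n = len(z_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(z_a[i], z_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true partner.
        total += log_denom - logits[i]
    return total / n


if __name__ == "__main__":
    # Toy embeddings: two subjects, two modalities.
    modality_a = [[1.0, 0.0], [0.0, 1.0]]
    aligned_b = [[1.0, 0.0], [0.0, 1.0]]      # partners match
    mismatched_b = [[0.0, 1.0], [1.0, 0.0]]   # partners swapped
    print(cross_modal_info_nce(modality_a, aligned_b))
    print(cross_modal_info_nce(modality_a, mismatched_b))
```

In this sketch the loss is small when each subject's two embeddings are more similar to each other than to other subjects' embeddings, and large otherwise. A taxonomy of objectives such as the one the abstract describes could be read as different choices of which pairs of components are fed into a loss of this kind.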
