REVE: A Foundation Model for EEG -- Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects

Foundation models have transformed AI by reducing reliance on task-specific data through large-scale pretraining. While successful in language and vision, their adoption in EEG has lagged due to the heterogeneity of public datasets, which are collected under varying protocols, devices, and electrode configurations. Existing EEG foundation models struggle to generalize across these variations and often restrict pretraining to a single setup, which results in suboptimal performance, particularly under linear probing. We present REVE (Representation for EEG with Versatile Embeddings), a pretrained model explicitly designed to generalize across diverse EEG signals. REVE introduces a novel 4D positional encoding scheme that enables it to process signals of arbitrary length and electrode arrangement. Using a masked autoencoding objective, we pretrain REVE on over 60,000 hours of EEG data from 92 datasets spanning 25,000 subjects, representing the largest EEG pretraining effort to date. REVE achieves state-of-the-art results on 10 downstream EEG tasks, including motor imagery classification, seizure detection, sleep staging, cognitive load estimation, and emotion recognition. With little to no fine-tuning, it demonstrates strong generalization and nuanced spatio-temporal modeling. We release code, pretrained weights, and tutorials to support standardized EEG research and accelerate progress in clinical neuroscience.

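The abstract does not spell out the form of the 4D positional encoding, so the following is only an illustrative sketch of the general idea: each token is tagged with an electrode position (x, y, z) and a time-patch index t, and the encoding is computed from those coordinates rather than from a fixed channel index. The sinusoidal features, the function names `token_coordinates` and `sinusoidal_4d_encoding`, and the parameters `dim` and `max_period` are all assumptions made for this example and are not taken from the REVE code.

```python
import math


def token_coordinates(electrode_xyz, n_patches):
    """Pair every electrode position with every time-patch index.

    electrode_xyz: list of (x, y, z) positions, one per channel (assumed layout).
    Returns one (x, y, z, t) tuple per token, ordered channel by channel.
    """
    return [
        (x, y, z, float(t))
        for (x, y, z) in electrode_xyz
        for t in range(n_patches)
    ]


def sinusoidal_4d_encoding(coord, dim=64, max_period=10000.0):
    """Encode one (x, y, z, t) coordinate as a vector of length `dim`.

    Assumption: each of the 4 axes gets dim/8 sine and dim/8 cosine features
    at geometrically spaced frequencies. This is a generic choice for the
    sketch, not the scheme confirmed by the paper.
    """
    if len(coord) != 4:
        raise ValueError("coord must be (x, y, z, t)")
    if dim % 8 != 0:
        raise ValueError("dim must be divisible by 8 (4 axes x sin/cos)")
    n_freq = dim // 8
    freqs = [max_period ** (-k / n_freq) for k in range(n_freq)]
    encoding = []
    for value in coord:
        encoding.extend(math.sin(value * f) for f in freqs)
        encoding.extend(math.cos(value * f) for f in freqs)
    return encoding


# Two hypothetical montages with different channel counts and lengths.
montage_a = [(0.0, 0.1, 0.9), (0.5, 0.0, 0.7)]
montage_b = [(0.5, 0.0, 0.7)]
tokens_a = token_coordinates(montage_a, n_patches=3)
tokens_b = token_coordinates(montage_b, n_patches=5)
enc_a = [sinusoidal_4d_encoding(c) for c in tokens_a]
enc_b = [sinusoidal_4d_encoding(c) for c in tokens_b]
```

Because the encoding depends only on coordinates, a token at the same electrode position and time index receives the same vector in both montages, which is the property that lets a single model accept arbitrary electrode arrangements and signal lengths in a scheme of this kind.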
