Electroencephalography (EEG) is a neuroimaging technique that records brain neural activity with high temporal resolution. Unlike other methods, EEG does not require prohibitively expensive equipment and can be easily set up using commercially available portable EEG caps, making it an ideal candidate for brain-computer interfaces. However, EEG signals are characterised by poor spatial resolution and high noise levels, complicating their decoding. In this study, we employ a contrastive learning framework to align encoded EEG features with pretrained CLIP features, achieving a 7% improvement over the state-of-the-art in EEG decoding of object categories. This enhancement is equally attributed to (1) a novel online sampling method that boosts the signal-to-noise ratio and (2) multimodal representations leveraging visual and language features to enhance the alignment space. Our analysis reveals a systematic interaction between the architecture and dataset of pretrained features and their alignment efficacy for EEG signal decoding. This interaction correlates with the generalisation power of the pretrained features on ImageNet-O/A datasets ($r=.5$). These findings extend beyond EEG signal alignment, offering potential for broader applications in neuroimaging decoding and generic feature alignments.
翻译:暂无翻译