Decoding speech from non-invasive brain signals, such as electroencephalography (EEG), has the potential to advance brain-computer interfaces (BCIs), with applications in silent communication and assistive technologies for individuals with speech impairments. However, EEG-based speech decoding faces major challenges, including noisy data, limited datasets, and poor performance on complex tasks such as speech perception. This study addresses these challenges by employing variational autoencoders (VAEs) for EEG data augmentation to improve data quality, and by applying a state-of-the-art (SOTA) sequence-to-sequence deep learning architecture, originally successful in electromyography (EMG) tasks, to EEG-based speech decoding. Additionally, we adapt this architecture for word classification. Using the Brennan dataset, which contains EEG recordings of subjects listening to narrated speech, we preprocess the data and evaluate both classification and sequence-to-sequence models on EEG-to-word and EEG-to-sentence tasks. Our experiments show that VAEs have the potential to generate artificial EEG data for augmentation. Meanwhile, our sequence-to-sequence model shows more promise in generating sentences than our classification model does in predicting words, though both tasks remain challenging. These findings lay the groundwork for future research on decoding speech perception from EEG, with possible extensions to speech production tasks such as silent or imagined speech.
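The VAE-based augmentation mentioned above can be illustrated with a minimal sketch: encode an EEG window to a latent Gaussian, then draw several latent samples per window and decode them into synthetic copies. This is not the paper's implementation; the channel count, window length, latent size, and the randomly initialised linear encoder/decoder (standing in for trained networks) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a 61-channel EEG window of 100 samples, flattened.
n_channels, n_samples = 61, 100
input_dim = n_channels * n_samples
latent_dim = 32

# Random linear maps stand in for a trained encoder/decoder.
W_enc = rng.normal(0.0, 0.01, (input_dim, 2 * latent_dim))  # outputs [mu, log_var]
W_dec = rng.normal(0.0, 0.01, (latent_dim, input_dim))

def encode(x):
    # Map each window to the mean and log-variance of its latent Gaussian.
    h = x @ W_enc
    return h[:, :latent_dim], h[:, latent_dim:]

def reparameterize(mu, log_var):
    # Reparameterisation trick: z = mu + sigma * eps, eps ~ N(0, I).
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def decode(z):
    return z @ W_dec

def augment(x, n_copies=4):
    # Draw n_copies latent samples per real window and decode each one.
    mu, log_var = encode(x)
    synthetic = [decode(reparameterize(mu, log_var)) for _ in range(n_copies)]
    return np.concatenate(synthetic, axis=0)

x = rng.normal(size=(8, input_dim))  # 8 "real" EEG windows
x_aug = augment(x)                   # 32 synthetic windows
print(x_aug.shape)                   # (32, 6100)
```

Because each decoded sample comes from a different latent draw around the same posterior mean, the synthetic windows vary around the original recording, which is what makes them useful as additional training data.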