Affective computing with electroencephalogram (EEG) signals is a challenging task that requires cumbersome models to effectively learn the information contained in large-scale EEG recordings, hindering real-time deployment on smart devices. In this paper, we propose a novel knowledge distillation pipeline to distill EEG representations via capsule-based architectures for both classification and regression tasks. Our goal is to distill information from a heavy model into a lightweight model for subject-specific tasks. To this end, we first pre-train a large model (teacher network) on a large number of training samples. Then, we employ the teacher network to learn the discriminative features embedded in capsules, and adopt a lightweight model (student network) to mimic the teacher using this privileged knowledge. The privileged information learned by the teacher contains similarities among capsules and is available only during the training stage of the student network. We evaluate the proposed architecture on two large-scale public EEG datasets, showing that our framework consistently enables student networks with different compression ratios to learn effectively from the teacher, even when provided with limited training samples. Lastly, our method achieves state-of-the-art results on one of the two datasets.
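The teacher–student training described above can be sketched with a standard distillation objective: the student is fit to a weighted combination of the softened teacher outputs and the ground-truth labels. This is a minimal NumPy sketch of generic (Hinton-style) knowledge distillation only; the paper's capsule-similarity privileged knowledge is a richer signal not shown here, and the function names, temperature `T`, and weight `alpha` are illustrative assumptions.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution,
    # exposing the teacher's inter-class similarity structure.
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label_onehot,
                      T=4.0, alpha=0.7):
    # Soft-target term: cross-entropy between the softened teacher and
    # student distributions, scaled by T^2 to keep gradient magnitudes
    # comparable across temperatures (only available while training
    # the student, mirroring "privileged" information).
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = -np.sum(p_teacher * np.log(p_student + 1e-12)) * (T ** 2)
    # Hard-target term: ordinary cross-entropy against the true label.
    hard = -np.sum(label_onehot * np.log(softmax(student_logits) + 1e-12))
    return alpha * soft + (1.0 - alpha) * hard
```

A student whose logits track the teacher's incurs a lower loss than one that contradicts it, which is the gradient signal that lets a small network mimic a large one.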