In the Machine Learning (ML) literature, a well-known problem is Dataset Shift: contrary to the standard ML assumption, the data in the training and test sets can follow different probability distributions, leading ML systems toward poor generalisation performance. This problem is strongly felt in the Brain-Computer Interface (BCI) context, where bio-signals such as electroencephalographic (EEG) signals are often used. Indeed, EEG signals are highly non-stationary both over time and across subjects. To overcome this problem, several proposed solutions rely on recent transfer learning approaches such as Domain Adaptation (DA). In several cases, however, the actual causes of the improvements remain ambiguous. This paper focuses on the impact of data normalisation (or standardisation) strategies applied together with DA methods. In particular, using the \textit{SEED}, \textit{DEAP}, and \textit{BCI Competition IV 2a} EEG datasets, we experimentally evaluated the impact of different normalisation strategies applied with and without several well-known DA methods, and compared the resulting performance. The results show that the choice of normalisation strategy plays a key role in classifier performance in DA scenarios and, interestingly, that in several cases an appropriate normalisation scheme alone outperforms the DA technique.