We propose a new, efficient way to sample from a Variational Autoencoder in the challenging low sample size setting. The method proves particularly well suited to data augmentation in such a low-data regime and is validated on several standard and real-life data sets. In particular, it greatly improves classification results on the OASIS database, where balanced accuracy rises from 80.7% for a classifier trained on the raw data to 88.6% for one trained only on the synthetic data generated by our method. Similar improvements were observed on three standard data sets and with other classifiers. Code is available at https://github.com/clementchadebec/Data_Augmentation_with_VAE-DALI.