Deep neural networks are susceptible to shortcut learning, using simple features to achieve low training loss without discovering essential semantic structure. Contrary to prior belief, we show that generative models alone are not sufficient to prevent shortcut learning, despite an incentive to recover a more comprehensive representation of the data than discriminative approaches. However, we observe that shortcuts are preferentially encoded with minimal information, a fact that generative models can exploit to mitigate shortcut learning. In particular, we propose Chroma-VAE, a two-pronged approach where a VAE classifier is initially trained to isolate the shortcut in a small latent subspace, allowing a secondary classifier to be trained on the complementary, shortcut-free latent subspace. In addition to demonstrating the efficacy of Chroma-VAE on benchmark and real-world shortcut learning tasks, our work highlights the potential for manipulating the latent space of generative classifiers to isolate or interpret specific correlations.
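Since the abstract describes the two-stage procedure only at a high level, a minimal PyTorch-style sketch may help fix ideas. This is a sketch under assumptions, not the authors' implementation: all module names, latent sizes (e.g. a 2-dimensional shortcut subspace), and loss weights below are illustrative.

```python
# Minimal sketch of the two-stage Chroma-VAE idea described in the abstract.
# Architecture details and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChromaVAE(nn.Module):
    def __init__(self, in_dim=784, hidden=400, z_shortcut=2, z_core=62, n_classes=10):
        super().__init__()
        self.z_s, self.z_c = z_shortcut, z_core
        z = z_shortcut + z_core
        self.enc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z)
        self.logvar = nn.Linear(hidden, z)
        self.dec = nn.Sequential(nn.Linear(z, hidden), nn.ReLU(),
                                 nn.Linear(hidden, in_dim))
        # Stage-1 classifier sees ONLY the small shortcut subspace z_s.
        self.head_s = nn.Linear(z_shortcut, n_classes)

    def encode(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return z, mu, logvar

    def forward(self, x):
        z, mu, logvar = self.encode(x)
        z_s = z[:, :self.z_s]
        return self.dec(z), self.head_s(z_s), mu, logvar

def stage1_loss(model, x, y, beta=1.0, gamma=1.0):
    # ELBO (reconstruction + KL) plus classification from the small subspace,
    # which funnels the minimal-information shortcut feature into z_s.
    x_hat, logits_s, mu, logvar = model(x)
    rec = F.mse_loss(x_hat, x, reduction="sum") / x.size(0)
    kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    return rec + beta * kl + gamma * F.cross_entropy(logits_s, y)

# Stage 2: freeze the trained encoder and fit a secondary classifier head_c
# (e.g. nn.Linear(z_core, n_classes)) on the complementary subspace z_c,
# which is ideally shortcut-free.
def stage2_logits(model, head_c, x):
    with torch.no_grad():
        z, _, _ = model.encode(x)
    return head_c(z[:, model.z_s:])
```

The design choice this sketch illustrates: because shortcuts are preferentially encoded with minimal information, the classification loss in stage 1 can be satisfied through the tiny z_s bottleneck alone, leaving the reconstruction objective to push the remaining semantic structure into z_c for the stage-2 classifier.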