Data augmentation is a highly effective approach for improving performance in deep neural networks. The standard view is that it creates an enlarged dataset by adding synthetic data, which raises a problem when combining it with Bayesian inference: how much data are we really conditioning on? This question is particularly relevant to recent observations linking data augmentation to the cold posterior effect. We investigate various principled ways of finding a log-likelihood for augmented datasets. Our approach prescribes augmenting the same underlying image multiple times, both at train and test time, and averaging either the logits or the predictive probabilities. Empirically, we observe the best performance when averaging probabilities. While there are interactions with the cold posterior effect, neither averaging logits nor averaging probabilities eliminates it.
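To make the two averaging schemes concrete, the following is a minimal sketch, not the paper's implementation: the `augment` and `model_logits` functions, the number of classes, and the number of augmentations K are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, rng):
    # Placeholder augmentation: random horizontal flip (illustrative only).
    return image[:, ::-1] if rng.random() < 0.5 else image

def model_logits(image):
    # Stand-in for a trained network's forward pass (hypothetical).
    return rng.normal(size=3)  # 3 classes

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

image = rng.normal(size=(32, 32))
K = 8  # number of augmentations of the same underlying image

logits = np.stack([model_logits(augment(image, rng)) for _ in range(K)])

# Scheme 1: average the logits across augmentations, then apply softmax once.
p_avg_logits = softmax(logits.mean(axis=0))

# Scheme 2: apply softmax to each augmented copy, then average the probabilities.
p_avg_probs = np.stack([softmax(l) for l in logits]).mean(axis=0)

print("averaged logits  ->", p_avg_logits)
print("averaged probs   ->", p_avg_probs)
```

The two schemes generally give different predictive distributions, since the softmax is nonlinear; the abstract reports better empirical performance for the probability-averaging variant.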