Mixup is a data-dependent regularization technique that linearly interpolates pairs of input samples and their associated outputs. It has been shown to improve accuracy when training on standard machine learning datasets. However, authors have pointed out that Mixup can produce out-of-distribution virtual samples, and even contradictions in the augmented training set, potentially resulting in adversarial effects. In this paper, we introduce Local Mixup, in which distant input samples are weighted down when computing the loss. In constrained settings, we demonstrate that Local Mixup creates a trade-off between bias and variance, with the extreme cases reducing to vanilla training and classical Mixup. Using standardized computer vision benchmarks, we also show that Local Mixup can improve test accuracy.
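To make the two ingredients concrete, below is a minimal NumPy sketch of classical Mixup and a Local Mixup-style loss weighting. The hinge-shaped weight on the pairwise distance, and its threshold eps, are illustrative assumptions for this sketch, not necessarily the exact scheme of the paper; the key property it captures is that pairs of distant samples contribute less to the loss, with w = 1 everywhere recovering classical Mixup.

    # Minimal sketch of Mixup and a Local Mixup-style loss weighting (NumPy).
    import numpy as np

    rng = np.random.default_rng(0)

    def mixup_batch(x, y, alpha=1.0):
        """Classical Mixup: convex combination of a batch with a permuted copy."""
        lam = rng.beta(alpha, alpha)          # mixing coefficient ~ Beta(alpha, alpha)
        perm = rng.permutation(len(x))        # random pairing of samples
        x_mix = lam * x + (1 - lam) * x[perm]
        y_mix = lam * y + (1 - lam) * y[perm]
        return x_mix, y_mix, perm

    def local_mixup_weights(x, perm, eps=5.0):
        """Down-weight mixed pairs whose endpoints are far apart.
        Hypothetical hinge weight: w = max(0, 1 - d / eps), in [0, 1]."""
        d = np.linalg.norm(x - x[perm], axis=1)   # distance between paired inputs
        return np.clip(1.0 - d / eps, 0.0, 1.0)

    # Toy usage: 8 samples, 4 features, one-hot labels for 2 classes.
    x = rng.normal(size=(8, 4))
    y = np.eye(2)[rng.integers(0, 2, size=8)]
    x_mix, y_mix, perm = mixup_batch(x, y)
    w = local_mixup_weights(x, perm)
    # Per-sample losses on (x_mix, y_mix) would then be multiplied by w
    # before averaging, e.g.:
    # loss = (w * per_sample_loss(model(x_mix), y_mix)).sum() / w.sum()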