Mixup is a popular regularization technique for training deep neural networks that can improve generalization and increase adversarial robustness. It perturbs input training data in the direction of other randomly chosen instances in the training set. To better leverage the structure of the data, we extend mixup to \emph{$k$-mixup} by perturbing $k$-batches of training points in the direction of other $k$-batches using displacement interpolation, i.e., interpolation under the Wasserstein metric. We demonstrate theoretically and in simulations that $k$-mixup preserves cluster and manifold structure, and we extend the theory studying the efficacy of standard mixup. Our empirical results show that training with $k$-mixup further improves generalization and robustness on benchmark datasets.
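To make the idea concrete, below is a minimal sketch of the $k$-mixup step described above, written in Python with NumPy/SciPy. It is illustrative only: the function name \texttt{k\_mixup\_batch}, the use of squared Euclidean cost, and the choice of drawing a single Beta-distributed mixing weight per $k$-batch pair are assumptions made for this sketch, not a verbatim reproduction of the paper's implementation. The key point it shows is that, for two uniform $k$-point clouds, displacement interpolation reduces to an optimal assignment between the two $k$-batches followed by ordinary pointwise mixing of the matched pairs.

\begin{verbatim}
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def k_mixup_batch(x, y, k, alpha=1.0, rng=None):
    """Illustrative k-mixup on one minibatch (hypothetical helper).

    x: (n, d) inputs, y: (n, c) one-hot labels; n must be divisible by k.
    Splits the batch into k-batches, optimally matches each k-batch to a
    randomly chosen partner k-batch, and mixes matched pairs with a
    Beta(alpha, alpha) weight (assumed here: one weight per k-batch pair).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    assert n % k == 0, "batch size must be a multiple of k"

    # Partner points come from a shuffled copy of the same batch.
    perm = rng.permutation(n)
    x2, y2 = x[perm], y[perm]

    x_mix = np.empty_like(x, dtype=float)
    y_mix = np.empty_like(y, dtype=float)
    for start in range(0, n, k):
        sl = slice(start, start + k)
        xa, xb = x[sl], x2[sl]
        # Optimal transport between two uniform k-point clouds reduces to
        # an optimal assignment under (assumed) squared Euclidean cost.
        cost = cdist(xa.reshape(k, -1), xb.reshape(k, -1), "sqeuclidean")
        _, match = linear_sum_assignment(cost)
        lam = rng.beta(alpha, alpha)
        # Displacement interpolation: mix each point with its matched partner.
        x_mix[sl] = lam * xa + (1 - lam) * xb[match]
        y_mix[sl] = lam * y[sl] + (1 - lam) * y2[sl][match]
    return x_mix, y_mix
\end{verbatim}

Note that $k=1$ recovers standard mixup (random pairing with no matching), so the sketch also makes clear in what sense $k$-mixup is a strict generalization.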