We introduce Noisy Feature Mixup (NFM), an inexpensive yet effective data augmentation method that combines the strengths of interpolation-based training and noise injection schemes. Rather than training with convex combinations of pairs of examples and their labels, we use noise-perturbed convex combinations of pairs of data points in both input and feature space. This method includes mixup and manifold mixup as special cases, but it has additional advantages, including better smoothing of decision boundaries and improved model robustness. We provide theory that explains these advantages and characterizes the implicit regularization effects of NFM. Our theory is supported by empirical results that demonstrate the advantage of NFM over mixup and manifold mixup. We show that residual networks and vision transformers trained with NFM achieve favorable trade-offs between predictive accuracy on clean data and robustness to various types of data perturbation across a range of computer vision benchmark datasets.
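To make the idea of a noise-perturbed convex combination concrete, the following is a minimal NumPy sketch rather than the authors' implementation. It assumes the perturbation is a combination of multiplicative and additive Gaussian noise applied to the mixed features, which the abstract does not specify. The function name `noisy_feature_mixup`, its parameters, and the default noise levels are hypothetical and chosen only for illustration.

```python
import numpy as np


def noisy_feature_mixup(h, y, rng, alpha=1.0, add_noise=0.4, mult_noise=0.2):
    """Mix a batch with a shuffled copy of itself, then perturb it with noise.

    h: (batch, features) array of inputs or hidden-layer features.
    y: (batch, classes) array of one-hot (or soft) labels.
    rng: a numpy.random.Generator.
    The noise model and default levels are assumptions for illustration.
    """
    # One mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)

    # Pair every example with a randomly chosen partner from the same batch.
    perm = rng.permutation(h.shape[0])

    # Convex combination of features and of labels (plain mixup).
    h_mix = lam * h + (1.0 - lam) * h[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]

    # Assumed noise injection: multiplicative and additive Gaussian noise.
    mult = 1.0 + mult_noise * rng.standard_normal(h.shape)
    add = add_noise * rng.standard_normal(h.shape)

    return mult * h_mix + add, y_mix
```

Under these assumptions, setting both noise levels to zero reduces the sketch to mixup when `h` holds raw inputs and to manifold mixup when `h` holds hidden-layer features, which matches the special cases named in the abstract.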