Data augmentation is an effective technique for improving the generalization of deep neural networks. However, previous data augmentation methods usually treat the augmented samples equally, without considering their individual impacts on the model. To address this, we propose assigning different weights to the augmented samples generated from the same training example. We construct the maximal expected loss, defined as the supremum of the expected loss over all reweighting strategies on the augmented samples. Inspired by adversarial training, we minimize this maximal expected loss (MMEL) and obtain a simple, interpretable closed-form solution: more attention should be paid to augmented samples with large loss values (i.e., harder examples). Minimizing the maximal expected loss enables the model to perform well under any reweighting strategy. The proposed method can be applied on top of any data augmentation method. Experiments are conducted on both natural language understanding tasks with token-level data augmentation and image classification tasks with commonly used image augmentation techniques such as random crop and horizontal flip. Empirical results show that the proposed method improves the generalization performance of the model.
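To make the reweighting concrete, below is a minimal PyTorch sketch of one plausible instantiation of the idea. It assumes an entropy-regularized inner maximization over the probability simplex, whose closed-form solution is a softmax of the per-sample losses, which matches the abstract's description that harder augmented samples receive larger weights. The function name `mmel_loss`, the temperature `lam`, and the batching convention are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def mmel_loss(logits, targets, lam=1.0):
    """Reweighted loss over K augmented views of one training example.

    Args:
        logits:  (K, C) model predictions for K augmented copies of the example.
        targets: (K,)   the label of the example, repeated K times.
        lam:     temperature of the (assumed) entropy regularizer; large lam
                 approaches the uniform average over views, small lam
                 concentrates weight on the hardest view.
    """
    # Per-view losses, kept separate instead of averaged.
    losses = F.cross_entropy(logits, targets, reduction="none")  # (K,)
    # Closed-form inner maximizer (under the entropy-regularization
    # assumption): softmax weights over the per-view losses, detached so
    # the weights act as constants during backpropagation.
    weights = torch.softmax(losses.detach() / lam, dim=0)        # (K,)
    # Weighted loss to be minimized over the model parameters.
    return (weights * losses).sum()
```

In use, one would generate K augmented views of each training example (e.g., K random crops of an image), forward them through the model, and apply `mmel_loss` per example; as `lam` grows the weights become uniform and the objective recovers the standard averaged augmentation loss.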