We present three large-scale experiments on a binary text matching classification task, in both Chinese and English, to evaluate the effectiveness and generalizability of random text perturbations as a data augmentation approach for NLP. We find that the augmentation can have either negative or positive effects on the test-set performance of three neural classification models, depending on whether the models are trained on enough original examples. This holds regardless of whether the five random text editing operations used for augmentation are applied together or separately. Our study strongly suggests that the effectiveness of random text perturbations is task-specific and not generally positive.
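For concreteness, below is a minimal sketch of what such random text editing operations might look like. The abstract does not enumerate the five operations, so the sketch illustrates three common EDA-style edits (random swap, random deletion, and random insertion of an existing token) on whitespace-tokenized English text; all function names and parameters are illustrative assumptions, not the paper's implementation. For Chinese input, the whitespace tokenization would need to be replaced by character-level or segmenter-based tokenization.

```python
import random

def random_swap(tokens, n=1):
    """Swap the positions of two randomly chosen tokens, n times."""
    tokens = tokens[:]
    for _ in range(n):
        if len(tokens) < 2:
            break
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p=0.1):
    """Delete each token independently with probability p (keep at least one)."""
    kept = [t for t in tokens if random.random() > p]
    return kept if kept else [random.choice(tokens)]

def random_insertion(tokens, n=1):
    """Insert a copy of a randomly chosen existing token at a random position."""
    tokens = tokens[:]
    for _ in range(n):
        tokens.insert(random.randrange(len(tokens) + 1), random.choice(tokens))
    return tokens

def augment(text, num_aug=4):
    """Generate num_aug perturbed variants of a non-empty, whitespace-tokenized text."""
    tokens = text.split()
    ops = [random_swap, random_deletion, random_insertion]
    return [" ".join(random.choice(ops)(tokens)) for _ in range(num_aug)]

if __name__ == "__main__":
    random.seed(0)
    for variant in augment("the quick brown fox jumps over the lazy dog"):
        print(variant)
```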