State-of-the-art automatic augmentation methods (e.g., AutoAugment and RandAugment) for visual recognition tasks diversify training data using a large set of augmentation operations. The range of magnitudes of many augmentation operations (e.g., brightness and contrast) is continuous. Therefore, to make the search computationally tractable, these methods use fixed, manually defined magnitude ranges for each operation, which may lead to sub-optimal policies. To answer the open question of how important the magnitude range of each augmentation operation is, we introduce RangeAugment, which allows us to efficiently learn the range of magnitudes for individual as well as composite augmentation operations. RangeAugment uses an auxiliary loss based on image similarity as a measure to control the range of magnitudes of augmentation operations. As a result, RangeAugment has a single scalar search parameter, image similarity, which we optimize with a simple linear search. RangeAugment integrates seamlessly with any model and learns model- and task-specific augmentation policies. With extensive experiments on the ImageNet dataset across different networks, we show that RangeAugment achieves performance competitive with state-of-the-art automatic augmentation methods while using 4 to 5 times fewer augmentation operations. Experimental results on semantic segmentation, object detection, foundation models, and knowledge distillation further show RangeAugment's effectiveness.
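The idea of controlling a magnitude range through an image-similarity loss can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not a detail stated in the abstract: PSNR is used as the similarity measure, the loss is the mean absolute gap between the PSNR of augmented images and a target value, brightness is the only operation, and a small grid search over symmetric ranges [1 - w, 1 + w] stands in for learning the range jointly with a model. The function names (`psnr`, `adjust_brightness`, `similarity_loss`, `fit_half_width`) are hypothetical.

```python
import math
import random


def psnr(original, augmented, max_value=1.0, min_mse=1e-10):
    """Peak signal-to-noise ratio in dB; higher means the images are more similar."""
    mse = sum((o - a) ** 2 for o, a in zip(original, augmented)) / len(original)
    # Floor the error so identical images give a large finite value (about 100 dB).
    return 10.0 * math.log10(max_value ** 2 / max(mse, min_mse))


def adjust_brightness(pixels, magnitude):
    """Scale pixel intensities by `magnitude` and clip the result to [0, 1]."""
    return [min(1.0, max(0.0, p * magnitude)) for p in pixels]


def similarity_loss(pixels, low, high, target_psnr, unit_samples):
    """Mean |PSNR - target| over magnitudes drawn uniformly from [low, high].

    Assumed form of the auxiliary loss, chosen only for illustration.
    """
    total = 0.0
    for u in unit_samples:
        magnitude = low + u * (high - low)  # map a draw in [0, 1) onto the range
        augmented = adjust_brightness(pixels, magnitude)
        total += abs(psnr(pixels, augmented) - target_psnr)
    return total / len(unit_samples)


def fit_half_width(pixels, target_psnr, candidates, unit_samples):
    """Pick the half-width w of the range [1 - w, 1 + w] with the lowest loss.

    A grid search is a stand-in here; it is not gradient-based range learning.
    """
    return min(
        candidates,
        key=lambda w: similarity_loss(pixels, 1.0 - w, 1.0 + w, target_psnr, unit_samples),
    )


rng = random.Random(0)
image = [rng.random() for _ in range(64)]         # toy 8x8 grayscale image in [0, 1]
unit_samples = [rng.random() for _ in range(32)]  # fixed draws, reused for every candidate
candidates = [0.05 * k for k in range(1, 11)]     # half-widths 0.05 .. 0.50

for target in (20.0, 30.0):
    w = fit_half_width(image, target, candidates, unit_samples)
    print(f"target PSNR {target:.0f} dB -> brightness range [{1 - w:.2f}, {1 + w:.2f}]")
```

In this toy setting the target similarity is the only knob: each target value maps to one brightness range, which is why a one-dimensional sweep over the target is enough. One would expect a lower target (less similar images allowed) to favor a wider range, though the sketch does not guarantee that for every input.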