Automatic augmentation methods have recently become a crucial pillar for strong model performance in vision tasks. Current methods mostly trade off simplicity, computational cost, and performance against one another. We present TrivialAugment, a very simple automatic augmentation baseline that outperforms previous methods almost for free. It is parameter-free and applies only a single augmentation to each image. We found TrivialAugment's effectiveness very unexpected and therefore evaluated its performance thoroughly. First, we compare TrivialAugment to previous state-of-the-art methods in a wide range of scenarios. Then, we perform multiple ablation studies with different augmentation spaces, augmentation methods, and setups to understand the crucial requirements for its performance. We condense our findings into recommendations for users of automatic augmentation. Additionally, we provide a simple interface for using multiple automatic augmentation methods in any codebase, as well as our full code base for reproducibility. Since our work reveals a stagnation in many parts of automatic augmentation research, we end with a short proposal of best practices for sustained future progress in automatic augmentation methods.
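To make the phrase "parameter-free and applies only a single augmentation to each image" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that both the operation and its strength are drawn uniformly at random for every image. The toy operations, the list-of-rows grayscale image format, the constant `NUM_STRENGTH_BINS`, and the function name `trivial_augment` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy operations on a grayscale image stored as a list of rows
# with integer pixel values in [0, 255]. Each takes a strength in [0, 1]
# and returns a new image without modifying its input.

def identity(img, strength):
    return [row[:] for row in img]

def brightness(img, strength):
    # Brighten by up to half of the value range, clipped at 255.
    shift = int(round(strength * 255 * 0.5))
    return [[min(255, p + shift) for p in row] for row in img]

def invert(img, strength):
    # Strength is ignored: this operation has no magnitude.
    return [[255 - p for p in row] for row in img]

def translate_x(img, strength):
    # Shift right by up to half of the image width, padding with zeros.
    shift = int(round(strength * len(img[0]) * 0.5))
    return [[0] * shift + row[:len(row) - shift] for row in img]

# Assumed augmentation space and number of discrete strength levels.
AUGMENTATIONS = [identity, brightness, invert, translate_x]
NUM_STRENGTH_BINS = 31

def trivial_augment(img, rng=random):
    """Apply exactly one uniformly sampled operation at a uniformly sampled strength."""
    op = rng.choice(AUGMENTATIONS)
    strength = rng.randrange(NUM_STRENGTH_BINS) / (NUM_STRENGTH_BINS - 1)
    return op(img, strength)
```

In a training loop, `trivial_augment` would be called once per image per epoch. Nothing is searched or tuned beyond the fixed choice of augmentation space, which is the sense in which such a scheme has no parameters to optimize.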