This paper introduces EfficientNetV2, a new family of convolutional networks with faster training speed and better parameter efficiency than previous models. To develop these models, we use a combination of training-aware neural architecture search and scaling to jointly optimize training speed and parameter efficiency. The models were searched from a search space enriched with new ops such as Fused-MBConv. Our experiments show that EfficientNetV2 models train much faster than state-of-the-art models while being up to 6.8x smaller. Training can be further sped up by progressively increasing the image size during training, but this often causes a drop in accuracy. To compensate for this accuracy drop, we propose to adaptively adjust regularization (e.g., dropout and data augmentation) as well, so that we achieve both fast training and good accuracy. With this progressive learning, our EfficientNetV2 significantly outperforms previous models on ImageNet and the CIFAR/Cars/Flowers datasets. By pretraining on the same ImageNet21k, our EfficientNetV2 achieves 87.3% top-1 accuracy on ImageNet ILSVRC2012, outperforming the recent ViT by 2.0% accuracy while training 5x-11x faster using the same computing resources. Code will be available at https://github.com/google/automl/tree/master/efficientnetv2.
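The core idea of progressive learning above, growing the image size over training while scaling regularization strength in step, can be sketched as a simple per-stage schedule. This is an illustrative sketch only: the function name, the stage count, and the linear interpolation between hypothetical minimum/maximum values are assumptions for exposition, not the paper's exact schedule.

```python
def progressive_schedule(stage, num_stages,
                         size_min=128, size_max=300,
                         dropout_min=0.1, dropout_max=0.3):
    """Illustrative progressive-learning schedule (not the paper's exact one).

    Early stages use small images with weak regularization; later stages
    use large images with stronger regularization, linearly interpolated.
    Returns (image_size, dropout_rate) for the given training stage.
    """
    # Interpolation factor t goes from 0 (first stage) to 1 (last stage).
    t = stage / max(num_stages - 1, 1)
    image_size = int(size_min + t * (size_max - size_min))
    dropout_rate = dropout_min + t * (dropout_max - dropout_min)
    return image_size, dropout_rate

# Example: a 4-stage run starts small and weakly regularized,
# and ends at full resolution with the strongest regularization.
for stage in range(4):
    size, drop = progressive_schedule(stage, num_stages=4)
    print(f"stage {stage}: image_size={size}, dropout={drop:.3f}")
```

Data augmentation magnitude (e.g., RandAugment strength) would be ramped the same way; the key point is that regularization strength is tied to image size rather than held fixed.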