Does progress on ImageNet transfer to real-world datasets? We investigate this question by evaluating ImageNet pre-trained models with varying accuracy (57% to 83%) on six practical image classification datasets. In particular, we study datasets collected with the goal of solving real-world tasks (e.g., classifying images from camera traps or satellites), as opposed to web-scraped benchmarks collected for comparing models. On multiple datasets, models with higher ImageNet accuracy do not consistently yield performance improvements. For certain tasks, interventions such as data augmentation improve performance even when architectures do not. We hope that future benchmarks will include more diverse datasets to encourage a more comprehensive approach to improving learning algorithms.