Transfer learning enables re-using knowledge learned on a source task to help learn a target task. A simple form of transfer learning is common in current state-of-the-art computer vision models: pre-training a model for image classification on the ILSVRC dataset, then fine-tuning it on any target task. However, previous systematic studies of transfer learning have been limited, and the circumstances in which it is expected to work are not fully understood. In this paper we carry out an extensive experimental exploration of transfer learning across vastly different image domains (consumer photos, autonomous driving, aerial imagery, underwater, indoor scenes, synthetic, close-ups) and task types (semantic segmentation, object detection, depth estimation, keypoint detection). Importantly, these are all complex, structured-output task types relevant to modern computer vision applications. In total we carry out over 1200 transfer experiments, including many where the source and target come from different image domains, task types, or both. We systematically analyze these experiments to understand the impact of image domain, task type, and dataset size on transfer learning performance. Our study leads to several insights and concrete recommendations for practitioners.