Domain Generalization (DG) studies the capability of a deep learning model to generalize to out-of-training distributions. Over the last decade, the literature has been flooded with training methodologies that claim to obtain more abstract and robust data representations to tackle domain shifts. Recent research has provided a reproducible benchmark for DG, pointing out the effectiveness of naive empirical risk minimization (ERM) over existing algorithms. Nevertheless, researchers persist in using the same outdated feature extractors, and the effect of different backbones has received no attention so far. In this paper, we go back to the backbones, proposing a comprehensive analysis of their intrinsic generalization capabilities, which the research community has ignored until now. We evaluate a wide variety of feature extractors, from standard residual networks to transformer-based architectures, and find a clear linear correlation between large-scale single-domain classification accuracy and DG capability. Our extensive experiments show that, by adopting competitive backbones in conjunction with effective data augmentation, plain ERM outperforms recent DG solutions and achieves state-of-the-art accuracy. Moreover, our additional qualitative studies reveal that novel backbones produce more similar representations for same-class samples while separating different domains in the feature space. This boost in generalization capability leaves marginal room for DG algorithms and suggests a new paradigm for investigating the problem: placing backbones in the spotlight and building consistent algorithms on top of them.
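To make the "plain ERM with a competitive backbone" recipe concrete, below is a minimal sketch of such a training setup. It assumes the `timm` and `torchvision` libraries; the backbone name, hyperparameters, class count, and dataset path are illustrative placeholders, not the paper's exact configuration.

```python
# Minimal sketch: plain ERM fine-tuning with a swappable pretrained backbone.
# Backbone name, hyperparameters, and dataset path are illustrative assumptions.
import timm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Standard data augmentation: random crops, flips, and color jitter.
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.3, 0.3, 0.3, 0.3),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Any ImageNet-pretrained feature extractor can be plugged in here,
# e.g. a residual network ("resnet50") or a vision transformer.
backbone_name = "vit_base_patch16_224"  # hypothetical choice
num_classes = 7                          # e.g. a 7-class DG benchmark

model = timm.create_model(backbone_name, pretrained=True,
                          num_classes=num_classes)

# ERM: minimize the average cross-entropy over all source domains pooled
# together, with no DG-specific loss terms or domain labels.
train_set = datasets.ImageFolder("path/to/source_domains",  # placeholder path
                                 transform=train_tf)
loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

The point of the sketch is that swapping `backbone_name` is the only change needed to compare feature extractors under an otherwise identical, algorithm-free training pipeline.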