Domain Generalization (DG) is perceived as the forefront of out-of-distribution (OOD) generalization. We present empirical evidence that the primary reason for generalization in DG is the presence of multiple domains during training. Furthermore, we show that methods that improve generalization in the IID setting are equally important for generalization in DG, whereas methods tailored to DG fail to add performance gains under the Traditional DG (TDG) evaluation. Our experiments raise the question of whether TDG has outlived its usefulness for evaluating OOD generalization. To strengthen our investigation, we propose a novel evaluation strategy, ClassWise DG (CWDG), in which, for each class, we randomly select one of the domains and hold it out for testing. We argue that this benchmark is closer to human learning and relevant in real-world scenarios. Counter-intuitively, although the model is exposed to all domains during training, CWDG is more challenging than TDG evaluation. In explaining these observations, our work makes a case for more fundamental analysis of the DG problem before exploring new ideas to tackle it.
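The CWDG split described above (one randomly chosen domain held out per class) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `classwise_dg_split`, the `(item, label, domain)` tuple layout, and the `seed` parameter are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def classwise_dg_split(samples, seed=0):
    """Hypothetical CWDG-style split.

    samples: iterable of (item, class_label, domain) tuples (assumed layout).
    Returns (train, test, held_out), where held_out maps each class label
    to the single domain reserved for testing that class.
    """
    samples = list(samples)
    rng = random.Random(seed)

    # Collect the domains in which each class actually appears.
    domains_by_class = defaultdict(set)
    for _, label, domain in samples:
        domains_by_class[label].add(domain)

    # For each class, randomly pick one of its domains to hold out.
    # Sorting keeps the choice reproducible for a given seed.
    held_out = {
        label: rng.choice(sorted(domains_by_class[label]))
        for label in sorted(domains_by_class)
    }

    # A sample goes to the test set only if its (class, domain) pair is held out.
    train = [s for s in samples if s[2] != held_out[s[1]]]
    test = [s for s in samples if s[2] == held_out[s[1]]]
    return train, test, held_out


if __name__ == "__main__":
    # Toy data: 3 classes x 4 domains x 2 items each (names are illustrative).
    classes = ["dog", "cat", "house"]
    domains = ["photo", "art", "cartoon", "sketch"]
    data = [(f"{c}_{d}_{i}", c, d) for c in classes for d in domains for i in range(2)]
    train, test, held_out = classwise_dg_split(data, seed=1)
    print(held_out)
    print(len(train), len(test))
```

Because the held-out domain is chosen independently for each class, a domain withheld for one class will usually still appear in training through other classes. This is what distinguishes the split from TDG, where one entire domain is unseen during training.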