This paper focuses on domain generalization (DG), the task of learning, from multiple source domains, a model that generalizes well to unseen domains. A main challenge for DG is that the available source domains often exhibit limited diversity, hampering the model's ability to learn to generalize. We therefore employ a data generator to synthesize data from pseudo-novel domains to augment the source domains. This explicitly increases the diversity of the available training domains and leads to a more generalizable model. To train the generator, we model the distribution divergence between the source and synthesized pseudo-novel domains using optimal transport, and maximize this divergence. To ensure that semantics are preserved in the synthesized data, we further impose cycle-consistency and classification losses on the generator. Our method, L2A-OT (Learning to Augment by Optimal Transport), outperforms current state-of-the-art DG methods on four benchmark datasets.
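As a hedged sketch only (the abstract does not fix the exact formulation), the generator objective described above can be summarized as

\[
\min_{G}\; -\, d_{\mathrm{OT}}\big(P_{X},\, P_{G(X)}\big)
\;+\; \lambda_{\mathrm{cyc}}\, \mathbb{E}_{x}\big\|\tilde{G}(G(x)) - x\big\|_{1}
\;+\; \lambda_{\mathrm{cls}}\, \mathbb{E}_{(x,y)}\big[\ell\big(f(G(x)),\, y\big)\big],
\]

where $d_{\mathrm{OT}}$ is an optimal-transport divergence (e.g., a Wasserstein distance) between the source distribution $P_{X}$ and the synthesized pseudo-novel distribution $P_{G(X)}$, $\tilde{G}$ is a reverse mapping enforcing cycle-consistency, $f$ is the classifier, $\ell$ is a classification loss such as cross-entropy, and $\lambda_{\mathrm{cyc}}, \lambda_{\mathrm{cls}}$ are weighting hyperparameters. All symbols here are illustrative assumptions, not notation taken from the paper.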