Human adaptability relies crucially on the ability to learn and merge knowledge from both supervised and unsupervised tasks: parents point out a few important concepts, but then children fill in the gaps on their own. This is particularly effective because supervised learning can never be exhaustive, so learning autonomously allows the learner to discover invariances and regularities that help it generalize. In this paper we propose to apply a similar approach to the problem of object recognition across domains: our model learns semantic labels in a supervised fashion and broadens its understanding of the data by learning from self-supervised signals on the same images. This secondary task helps the network focus on object shapes, learning concepts such as spatial orientation and part correlation, while acting as a regularizer for the classification task over multiple visual domains. Extensive experiments confirm our intuition and show that our multi-task method, which combines supervised and self-supervised knowledge, is competitive with more complex domain generalization and adaptation solutions. It also proves its potential in the novel and challenging predictive and partial domain adaptation scenarios.
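The multi-task objective described above can be sketched as a weighted sum of a supervised classification loss and a self-supervised auxiliary loss. The following is a minimal illustrative sketch, not the paper's implementation: the function names, the toy logits, and the weighting parameter `alpha` are all assumptions made for the example.

```python
# Hypothetical sketch of a multi-task loss: supervised cross-entropy on
# semantic labels plus a weighted self-supervised cross-entropy on the
# same images (e.g. predicting a spatial transformation of the input).
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target_index):
    # Negative log-probability of the correct class.
    return -math.log(softmax(logits)[target_index])

def multi_task_loss(cls_logits, cls_target,
                    self_logits, self_target, alpha=0.7):
    """Supervised label loss plus alpha-weighted self-supervised loss.

    The self-supervised head shares the backbone with the classifier,
    so this auxiliary term acts as a regularizer on the shared features.
    """
    supervised = cross_entropy(cls_logits, cls_target)
    auxiliary = cross_entropy(self_logits, self_target)
    return supervised + alpha * auxiliary

# Toy example: 3 object classes, 4 self-supervised pseudo-labels
# (e.g. four candidate spatial rearrangements of the image).
loss = multi_task_loss([2.0, 0.1, -1.0], 0,
                       [0.5, 1.5, 0.2, -0.3], 1, alpha=0.7)
```

In this formulation, `alpha` trades off how strongly the self-supervised signal shapes the shared representation relative to the label supervision.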