Robotic visual systems operating in the wild must act in unconstrained scenarios, under varying environmental conditions, while facing a variety of semantic concepts, including unknown ones. To this end, recent works have tried to equip visual object recognition methods with the capability to i) detect unseen concepts and ii) extend their knowledge over time as images of new semantic classes arrive. This setting, called Open World Recognition (OWR), aims to produce systems capable of breaking the semantic limits of the initial training set. However, the training set imposes on the system not only its semantic limits but also environmental ones, because it is biased toward certain acquisition conditions that do not necessarily reflect the high variability of the real world. This discrepancy between training and test distributions is called domain shift. This work investigates whether OWR algorithms are effective under domain shift, presenting the first benchmark setup for fairly assessing the performance of OWR algorithms with and without domain shift. We then use this benchmark to conduct analyses in various scenarios, showing that existing OWR algorithms indeed suffer severe performance degradation when training and test distributions differ. Our analysis shows that this degradation is only slightly mitigated by coupling OWR with domain generalization techniques, indicating that the mere plug-and-play of existing algorithms is not enough to recognize new and unknown categories in unseen domains. Our results clearly point toward open issues and future research directions that need to be investigated to build robot visual systems able to function reliably under these challenging yet very real conditions. Code is available at https://github.com/DarioFontanel/OWR-VisualDomains