While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed-world scenarios, where the set of semantic concepts to be recognized is fixed by the available training set. Since it is practically impossible to capture all semantic concepts present in the real world in a single training set, we need to break the closed-world assumption and equip our robot with the capability to act in an open world. To provide this capability, a robot vision system should be able to (i) identify whether an instance does not belong to the set of known categories (i.e. open set recognition), and (ii) extend its knowledge to learn new classes over time (i.e. incremental learning). In this work, we show how to boost the performance of deep open world recognition algorithms by means of a new loss formulation enforcing a global-to-local clustering of class-specific features. In particular, the first loss term, global clustering, forces the network to map each sample closer to the centroid of the class it belongs to, while the second, local clustering, shapes the representation space so that samples of the same class move closer together while neighbours belonging to other classes are pushed away. Moreover, we propose a strategy to learn class-specific rejection thresholds, instead of heuristically estimating a single global threshold as in previous works. Experiments on the RGB-D Object and Core50 datasets show the effectiveness of our approach.
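The abstract names three ingredients without giving formulas: a global clustering term, a local clustering term, and class-specific rejection thresholds. The following is a minimal NumPy sketch of how such terms could look, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the use of squared Euclidean distance, the softmax-over-distances form of both losses, the neighbourhood size `k`, and especially `class_thresholds`, which substitutes a simple per-class statistic (the largest distance of a class's samples to its own centroid) for the learned thresholds the abstract describes.

```python
import numpy as np

def sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a and rows of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)

def class_centroids(features, labels, num_classes):
    """Mean feature vector of each class (assumes every class has samples)."""
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def global_clustering_loss(features, labels, centroids):
    """Assumed form: cross-entropy over a softmax of negative squared
    distances to the centroids, pulling samples toward their own centroid."""
    logits = -sq_dists(features, centroids)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def local_clustering_loss(features, labels, k=3):
    """Assumed form: for each sample, a softmax over its k nearest neighbours;
    the loss is the negative log of the mass on same-class neighbours."""
    d = sq_dists(features, features)
    np.fill_diagonal(d, np.inf)                  # a sample is not its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]           # indices of the k nearest neighbours
    nd = np.take_along_axis(d, idx, axis=1)
    w = np.exp(-(nd - nd.min(axis=1, keepdims=True)))
    same = labels[idx] == labels[:, None]
    p_same = (w * same).sum(axis=1) / w.sum(axis=1)
    return -np.log(p_same + 1e-12).mean()

def class_thresholds(features, labels, centroids):
    """Illustrative stand-in for learned thresholds: per class, the largest
    squared distance of its samples to the class centroid."""
    d = sq_dists(features, centroids)
    own = d[np.arange(len(labels)), labels]
    return np.array([own[labels == c].max() for c in range(len(centroids))])

def predict_open_set(features, centroids, thresholds):
    """Nearest-centroid prediction; returns -1 (unknown) when the distance
    exceeds the threshold of the nearest class."""
    d = sq_dists(features, centroids)
    nearest = d.argmin(axis=1)
    nearest_d = d[np.arange(len(features)), nearest]
    return np.where(nearest_d <= thresholds[nearest], nearest, -1)
```

With one threshold per class, a tightly clustered class can reject outliers at a small distance while a more spread-out class keeps a looser bound, which a single global threshold cannot express.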