The latest deep learning methods for object detection achieve remarkable performance, but they have limitations in robotic applications. One of the most relevant issues is the long training time, which is due to the large size and imbalance of the associated training sets, characterized by few positive examples and a large number of negative (i.e., background) examples. Existing approaches are based on end-to-end learning by back-propagation [22] or on kernel methods trained with Hard Negatives Mining on top of deep features [8]. These solutions are effective, but prohibitively slow for on-line applications. In this paper we propose a novel pipeline for object detection that overcomes this problem and provides comparable performance, with a 60x training speedup. Our pipeline combines (i) the Region Proposal Network and the deep feature extractor from [22], to efficiently select candidate RoIs and encode them into powerful representations, with (ii) the FALKON [23] algorithm, a novel kernel-based method that allows fast training on large-scale problems (millions of points). We address the size and imbalance of the training data by exploiting the stochastic subsampling intrinsic to the method and a novel, fast bootstrapping approach. We assess the effectiveness of the approach on a standard Computer Vision dataset (PASCAL VOC 2007 [5]) and demonstrate its applicability to a real robotic scenario with the iCubWorld Transformations [18] dataset.
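To make the two ingredients named above more concrete (a kernel classifier trained on a random subsample of centers, and a bootstrapping pass over the negatives), the following is a minimal sketch under stated assumptions, not the pipeline evaluated in this paper. It fits a Nyström-approximated kernel ridge regressor with a direct least-squares solve, whereas FALKON addresses the same kind of problem with a preconditioned iterative solver. It visits the negatives in batches and keeps only those that the current model does not already score as clear background. Feature vectors are assumed to be precomputed, and all names and parameters (`gaussian_kernel`, `fit_nystrom_krr`, `train_with_bootstrapping`, `keep_margin`, and so on) are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def fit_nystrom_krr(X, y, n_centers, sigma, lam, rng):
    """Kernel ridge regression restricted to randomly subsampled centers.

    Assumption: a direct solve stands in for FALKON's iterative solver.
    """
    m = min(n_centers, len(X))
    centers = X[rng.choice(len(X), size=m, replace=False)]
    K_nm = gaussian_kernel(X, centers, sigma)
    K_mm = gaussian_kernel(centers, centers, sigma)
    # Nystrom normal equations: (K_nm^T K_nm + lam * n * K_mm) alpha = K_nm^T y
    lhs = K_nm.T @ K_nm + lam * len(X) * K_mm
    alpha = np.linalg.lstsq(lhs, K_nm.T @ y, rcond=None)[0]
    return centers, alpha


def predict(model, X, sigma):
    """Real-valued scores; positive means 'object', negative means 'background'."""
    centers, alpha = model
    return gaussian_kernel(X, centers, sigma) @ alpha


def train_with_bootstrapping(pos, neg, n_batches=4, n_centers=50,
                             sigma=1.0, lam=1e-6, keep_margin=-0.7, seed=0):
    """Hypothetical bootstrapping loop over batches of negative examples."""
    rng = np.random.default_rng(seed)
    batches = np.array_split(neg[rng.permutation(len(neg))], n_batches)
    hard = batches[0]  # start from the first batch of negatives
    model = None
    for i in range(n_batches):
        if i > 0:
            # Keep only negatives the current model does not confidently reject.
            scores = predict(model, batches[i], sigma)
            hard = np.vstack([hard, batches[i][scores > keep_margin]])
        X = np.vstack([pos, hard])
        y = np.concatenate([np.ones(len(pos)), -np.ones(len(hard))])
        model = fit_nystrom_krr(X, y, n_centers, sigma, lam, rng)
    return model
```

In this sketch the training cost is governed by the number of centers and of retained negatives rather than by the full set of background examples, which is the intuition behind handling size and imbalance through subsampling. Calling `train_with_bootstrapping(pos, neg)` on two arrays of feature vectors returns a `(centers, alpha)` pair that `predict` can score new candidates with.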