Few-shot object detection has rapidly progressed owing to the success of meta-learning strategies. However, the fine-tuning stage required by existing methods is time-consuming and significantly hinders their use in real-time applications, such as autonomous exploration with low-power robots. To solve this problem, we present a novel architecture, AirDet, which is free of fine-tuning by learning class-agnostic relations with support images. Specifically, we propose a support-guided cross-scale (SCS) feature fusion network to generate object proposals, a global-local relation network (GLR) for shots aggregation, and a relation-based prototype embedding network (R-PEN) for precise localization. Exhaustive experiments are conducted on the COCO and PASCAL VOC datasets, where, surprisingly, AirDet achieves results comparable to or even better than exhaustively fine-tuned methods, improving upon the baseline by up to 40-60%. Notably, AirDet obtains favorable performance on multi-scale objects, especially small ones. Furthermore, we present evaluation results on real-world exploration tests from the DARPA Subterranean Challenge, which strongly validate the feasibility of AirDet in robotics. The source code, pre-trained models, and the real-world exploration data will be made public.