Several modern deep single-stage object detectors are fast enough for real-time processing but still lag behind more complex detectors in detection performance. The trade-off between model performance and computing speed is an important challenge, and it is directly related to the learning process. In this paper, we propose a new way to efficiently train a single-shot detector that provides a very good trade-off between these two factors. For this purpose, we introduce LapNet, an anchor-based detector trained end-to-end without any sampling strategy. Our approach addresses two limitations of anchor-based detector training: (1) the ambiguity of anchor-to-ground-truth assignment and (2) the imbalance between classes and between object sizes. More specifically, we propose a new method to assign positive and negative anchors, based on a new overlap function called "Per-Object Normalized Overlap" (PONO). This more flexible assignment can be self-corrected by the network itself to avoid ambiguity between close objects. In the learning process, we also propose to automatically learn weights that balance classes and object sizes, so that sample imbalance is managed efficiently. This makes it possible to build a robust object detector that avoids multi-scale prediction, in the spirit of semantic segmentation.
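The abstract names PONO but does not define it. As a rough illustration only, the sketch below assumes, from the name alone, that the overlap between an anchor and an object is the usual IoU divided by the best IoU that any anchor reaches for that same object. The function names `iou` and `per_object_normalized_overlap`, the `(x1, y1, x2, y2)` box format, and the normalization itself are assumptions made for this example and may differ from the definition used in the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def per_object_normalized_overlap(anchors, objects):
    """Return scores[i][j] for anchor i and object j.

    Assumed definition (not taken from the paper): the IoU of each
    anchor with an object, divided by the highest IoU that any anchor
    achieves for that object, so each object's best anchor scores 1.
    """
    ious = [[iou(a, o) for o in objects] for a in anchors]
    # Best raw IoU per object (column-wise maximum).
    best = [max(row[j] for row in ious) for j in range(len(objects))]
    return [
        [row[j] / best[j] if best[j] > 0 else 0.0 for j in range(len(objects))]
        for row in ious
    ]


anchors = [(0, 0, 10, 10), (0, 0, 5, 10), (20, 20, 30, 30)]
objects = [(0, 0, 10, 10), (0, 0, 4, 4)]
scores = per_object_normalized_overlap(anchors, objects)
```

Under this assumed normalization, the small object `(0, 0, 4, 4)` never exceeds a raw IoU of 0.32 with any anchor, yet its best anchor still receives a score of 1. A fixed IoU threshold would leave such an object without a positive anchor, which is one way to read the size imbalance the abstract mentions.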