We present a simple, NMS-free, end-to-end object detection framework whose network is a minimal modification of a one-stage object detector such as FCOS [Tian et al. 2019]. The resulting detector matches or improves on the accuracy of the original one-stage detector and runs at almost the same inference speed, while being simpler because the NMS (non-maximum suppression) post-processing step is eliminated at inference. The key observation is that NMS becomes unnecessary if the network can identify exactly one positive sample for prediction for each ground-truth object instance in an image. We make this possible by attaching a compact PSS head that automatically selects the single positive sample for each instance (see Fig. 1). Because the learning objective involves both one-to-many and one-to-one label assignments, the labels of some training examples conflict, which makes learning challenging. We show that a stop-gradient operation resolves this conflict and allows the detector to be trained successfully. On the COCO dataset, our simple design outperforms both the FCOS baseline detector with NMS post-processing and recent end-to-end NMS-free detectors. Extensive ablation studies support the design choices.
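To make the idea of NMS-free inference concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each candidate carries a classification score and a score from the PSS head, and that the two are fused by multiplication; the fusion rule, the names `Candidate` and `decode_without_nms`, and the threshold and `top_k` values are all illustrative assumptions. If the PSS score is high for only one candidate per object, thresholding the fused score may be enough to remove duplicates without any box-overlap test.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Candidate:
    """One dense prediction; field names are hypothetical, for illustration only."""
    box: Box
    cls_score: float  # classification confidence in [0, 1]
    pss_score: float  # selector confidence in [0, 1] (assumed output of the PSS head)


def decode_without_nms(
    candidates: List[Candidate],
    score_thresh: float = 0.3,  # assumed value, not taken from the paper
    top_k: int = 100,           # assumed value, not taken from the paper
) -> List[Tuple[Box, float]]:
    """Rank candidates by a fused score and keep the best ones.

    No IoU computation or suppression loop is used: duplicates are expected
    to be filtered out because their selector score is low.
    """
    # Assumption: the final score is the product of the two confidences.
    scored = [(c.box, c.cls_score * c.pss_score) for c in candidates]
    kept = [item for item in scored if item[1] >= score_thresh]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


# Three overlapping candidates for one object, plus one for a second object.
candidates = [
    Candidate((10, 10, 50, 50), cls_score=0.90, pss_score=0.95),    # selected sample
    Candidate((11, 9, 51, 49), cls_score=0.88, pss_score=0.05),     # duplicate
    Candidate((12, 11, 49, 52), cls_score=0.85, pss_score=0.10),    # duplicate
    Candidate((100, 100, 160, 180), cls_score=0.80, pss_score=0.90),
]
detections = decode_without_nms(candidates)
for box, score in detections:
    print(box, round(score, 3))
```

In this toy input the two duplicates have high classification scores, so a classification-only detector would need NMS to remove them; with the fused score they fall below the threshold and only one box per object remains.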