Recent end-to-end multi-object detectors simplify the inference pipeline by removing hand-crafted steps such as duplicate bounding-box removal with non-maximum suppression (NMS). During training, however, they still require bipartite matching to compute the loss from the detector's outputs. Contrary to the spirit of the end-to-end approach, bipartite matching makes training complex and reliant on heuristics. In this paper, we propose a method to train an end-to-end multi-object detector without bipartite matching. To this end, we treat end-to-end multi-object detection as density estimation with a mixture model. Our proposed detector, the Sparse Mixture Density Object Detector (Sparse MDOD), estimates the distribution of bounding boxes using a mixture model. Sparse MDOD is trained by minimizing the negative log-likelihood together with our proposed regularization term, the maximum component maximization (MCM) loss, which prevents duplicate predictions. Training needs no additional procedure such as bipartite matching, and the loss is computed directly from the network outputs. Moreover, Sparse MDOD outperforms existing detectors on MS-COCO, a widely used multi-object detection benchmark.
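To make the matching-free negative log-likelihood concrete, the following is a minimal sketch rather than the Sparse MDOD implementation. It assumes a diagonal Gaussian density over the four box coordinates and omits both the class term and the MCM loss, which the abstract does not define. The names `gaussian_box_logpdf`, `logsumexp`, and `mixture_nll` are hypothetical.

```python
import math


def gaussian_box_logpdf(box, mean, std):
    """Log-density of a 4-D box under a diagonal Gaussian (an assumed component density)."""
    total = 0.0
    for x, m, s in zip(box, mean, std):
        total += -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((x - m) / s) ** 2
    return total


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def mixture_nll(gt_boxes, weights, means, stds):
    """Mean negative log-likelihood of ground-truth boxes under the mixture.

    weights: positive mixing coefficients that sum to 1, one per component.
    means, stds: per-component 4-D box parameters (stds must be positive).
    """
    nll = 0.0
    for box in gt_boxes:
        # Every component contributes to every ground-truth box,
        # so no box-to-prediction assignment is computed.
        log_terms = [
            math.log(w) + gaussian_box_logpdf(box, m, s)
            for w, m, s in zip(weights, means, stds)
        ]
        nll -= logsumexp(log_terms)
    return nll / len(gt_boxes)


if __name__ == "__main__":
    gt = [[0.2, 0.3, 0.5, 0.6]]
    weights = [0.5, 0.5]
    stds = [[0.1] * 4, [0.1] * 4]
    near = [[0.2, 0.3, 0.5, 0.6], [0.8, 0.8, 0.9, 0.9]]
    far = [[0.7, 0.7, 0.9, 0.9], [0.8, 0.8, 0.9, 0.9]]
    # A mixture with a component centered on the ground truth yields the lower loss.
    print(mixture_nll(gt, weights, near, stds) < mixture_nll(gt, weights, far, stds))
```

Because the loss sums over all mixture components for each ground-truth box, it is a direct function of the predicted parameters, which is the property the abstract contrasts with bipartite matching.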