Recent end-to-end multi-object detectors simplify the inference pipeline by removing hand-crafted steps such as duplicate bounding-box removal with non-maximum suppression (NMS). During training, however, they require bipartite matching to compute the loss from the detector's outputs. This runs counter to the directness at the heart of end-to-end learning: bipartite matching makes the training of an end-to-end detector complex, heuristic, and reliant on the matching procedure. In this paper, we propose a method to train an end-to-end multi-object detector without bipartite matching. To this end, we approach end-to-end multi-object detection as a density estimation problem. Our proposed detector, called Sparse Mixture Density Object Detector (Sparse MDOD), estimates the distribution of bounding boxes using a mixture model. Sparse MDOD is trained by minimizing the negative log-likelihood together with our proposed regularization term, the maximum component maximization (MCM) loss, which prevents duplicate predictions. During training, no additional procedure such as bipartite matching is needed, and the loss is computed directly from the network outputs. Moreover, Sparse MDOD outperforms existing detectors on MS-COCO, a widely used multi-object detection benchmark.
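To illustrate why a likelihood-based loss needs no matching step, the following is a minimal sketch of a mixture negative log-likelihood over box coordinates. It is not the authors' implementation: the diagonal Gaussian components, the function names `log_gaussian_diag` and `mixture_nll`, and the 4-value box encoding are all assumptions made for illustration, and the MCM regularizer is not shown because the abstract does not define it.

```python
import math


def log_gaussian_diag(x, mean, std):
    """Log-density of a diagonal Gaussian at one box x (assumed component type)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((xi - m) / s) ** 2
        for xi, m, s in zip(x, mean, std)
    )


def mixture_nll(boxes, weights, means, stds):
    """Mean negative log-likelihood of ground-truth boxes under the mixture.

    boxes:   list of ground-truth boxes, each a list of 4 coordinates
    weights: mixture weights, strictly positive and summing to 1
    means:   per-component box means (hypothetical network outputs)
    stds:    per-component per-coordinate standard deviations, all > 0
    """
    total = 0.0
    for box in boxes:
        # Every component contributes to every box, so no assignment is needed.
        terms = [
            math.log(w) + log_gaussian_diag(box, m, s)
            for w, m, s in zip(weights, means, stds)
        ]
        # log-sum-exp for numerical stability
        peak = max(terms)
        total -= peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(boxes)


# Usage: two ground-truth boxes scored against a two-component mixture.
gt_boxes = [[0.1, 0.1, 0.4, 0.4], [0.6, 0.6, 0.9, 0.9]]
loss = mixture_nll(
    gt_boxes,
    weights=[0.5, 0.5],
    means=[[0.1, 0.1, 0.4, 0.4], [0.6, 0.6, 0.9, 0.9]],
    stds=[[0.1] * 4, [0.1] * 4],
)
```

Because each ground-truth box is scored against the whole mixture, the loss is a direct function of the network outputs. A matching-based loss would instead first solve an assignment between predictions and ground-truth boxes.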