We demonstrate that many detection methods are designed to identify only a sufficiently accurate bounding box, rather than the best available one. To address this issue, we propose Fitness NMS, a simple and fast modification to existing methods. Tested with the DeNet model, it obtains a significantly improved MAP at greater localization accuracies without a loss in evaluation rate. Next, we derive a novel bounding box regression loss based on a set of IoU upper bounds that better matches the goal of IoU maximization while still providing good convergence properties. We then investigate RoI clustering schemes for improving evaluation rates of the DeNet \textit{wide} model variants, and analyze localization performance at various input image dimensions. We obtain a MAP[0.5:0.95] of 33.6\% at 79 Hz and 41.8\% at 5 Hz on MSCOCO with a Titan X (Maxwell).
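The abstract does not spell out how Fitness NMS works, so the following is only a minimal sketch of one plausible reading: each candidate box is assumed to carry, besides its class confidence, a "fitness" value estimating how well it overlaps the object, and the two are multiplied before ordinary greedy non-max suppression. The names `iou`, `fitness_nms`, `class_scores`, `fitness` and `iou_threshold` are hypothetical and not taken from the source.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fitness_nms(boxes, class_scores, fitness, iou_threshold=0.5):
    """Greedy NMS ranked by class score times an assumed per-box fitness.

    Returns the indices of the kept boxes, best first.
    """
    # Assumption: the ranking score is the product of the two estimates.
    scores = [c * f for c, f in zip(class_scores, fitness)]
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)

    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
class_scores = [0.9, 0.8, 0.7]

# With uniform fitness this reduces to standard NMS: box 0 suppresses box 1.
print(fitness_nms(boxes, class_scores, [1.0, 1.0, 1.0]))
# With a higher fitness on box 1, it outranks box 0 and survives instead.
print(fitness_nms(boxes, class_scores, [0.5, 0.9, 0.8]))
```

Under these assumptions the only change from standard NMS is the ranking score, so the suppression loop costs the same, which fits the claim of no loss in evaluation rate.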