We demonstrate that many detection methods are designed to identify only a sufficiently accurate bounding box, rather than the best available one. To address this issue, we propose Fitness NMS, a simple and fast modification to existing methods. Tested with the DeNet model, it obtains a significantly improved MAP at greater localization accuracies without a loss in evaluation rate, and it can be used in conjunction with Soft NMS for additional improvements. Next, we derive a novel bounding box regression loss based on a set of IoU upper bounds that better matches the goal of IoU maximization while still providing good convergence properties. Following these contributions, we investigate RoI clustering schemes for improving evaluation rates for the DeNet wide model variants and provide an analysis of localization performance at various input image dimensions. We obtain a MAP of 33.6% at 79 Hz and 41.8% at 5 Hz on MSCOCO with a Titan X (Maxwell). Source code is available from: https://github.com/lachlants/denet
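For readers unfamiliar with the terms used above, the following is a minimal sketch of IoU and greedy non-maximum suppression (NMS) in plain Python. It is not the DeNet implementation. The function names, the `fitness` values, and the choice to rank boxes by the product of class score and fitness are illustrative assumptions that this abstract does not specify; they only show how the ranking key passed to NMS can change which of two overlapping boxes survives.

```python
from typing import List, Sequence

Box = Sequence[float]  # (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0.0 else 0.0


def greedy_nms(boxes: Sequence[Box], rank_scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, in descending order of rank score.

    A box is kept only if its IoU with every previously kept box
    does not exceed iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: rank_scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: boxes 0 and 1 overlap heavily, box 2 is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
class_scores = [0.9, 0.8, 0.7]
fitness = [0.5, 0.9, 0.8]  # assumed per-box localization quality estimates

# Ranking by class score alone keeps box 0 and suppresses box 1.
kept_by_class = greedy_nms(boxes, class_scores)

# Assumed fitness-weighted ranking: box 1 now outranks box 0.
weighted = [c * f for c, f in zip(class_scores, fitness)]
kept_by_fitness = greedy_nms(boxes, weighted)
```

In this toy example the two rankings keep different boxes from the overlapping pair, which is the kind of effect the abstract attributes to selecting the best available box rather than a sufficiently accurate one.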