Object detection is a fundamental vision task. It has been studied extensively in academia and is widely adopted in industry. Average Precision (AP) is the standard score for evaluating object detectors. Our understanding of the subtleties of this score, however, is limited. Here, we quantify the sensitivity of AP to bounding box perturbations and show that AP is very sensitive to small translations. A shift of just one pixel is enough to drop the mAP of a model by 8.4%. Over small objects, the same one-pixel shift drops mAP by 23.1%. The corresponding numbers when ground-truth (GT) boxes are used as predictions are 23% and 41.7%, respectively. These results explain why achieving higher mAP becomes increasingly hard as models get better. We also investigate the effect of box scaling on AP. Code and data are available at https://github.com/aliborji/AP_Box_Perturbation.
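To give some intuition for why small objects are hit hardest, the sketch below computes the intersection over union (IoU) between a box and a perturbed copy of itself. It is not the authors' code and does not compute AP. It assumes the usual setting in which a prediction counts as a match only if its IoU with a ground-truth box exceeds a threshold, and it uses square boxes with continuous coordinates in `(x1, y1, x2, y2)` format. The helper names (`iou`, `shift`, `scale`, `iou_after_shift`) and the box sizes are illustrative choices.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def shift(box, dx, dy):
    """Translate a box by (dx, dy) pixels."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def scale(box, factor):
    """Scale a box about its center by the given factor."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * factor / 2.0
    half_h = (y2 - y1) * factor / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def iou_after_shift(side, dx=1.0, dy=0.0):
    """IoU between a side x side box and the same box shifted by (dx, dy)."""
    box = (0.0, 0.0, float(side), float(side))
    return iou(box, shift(box, dx, dy))


if __name__ == "__main__":
    # Illustrative box sizes (assumed, not taken from the paper).
    for side in (8, 16, 32, 96, 256):
        print(f"side={side:4d}  IoU after 1 px shift = {iou_after_shift(side):.3f}")
    box = (0.0, 0.0, 32.0, 32.0)
    for factor in (0.9, 1.1):
        print(f"scale={factor}  IoU = {iou(box, scale(box, factor)):.3f}")
```

For a horizontal shift of `dx` pixels on a box of width `w`, the IoU reduces to `(w - dx) / (w + dx)`, so the same one-pixel shift costs a small box far more overlap than a large one. This is consistent with the larger mAP drop reported for small objects, although the actual AP numbers depend on the full matching and ranking procedure.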