The effectiveness of object detection, one of the central problems in computer vision, depends heavily on the definition of the loss function - a measure of how accurately an ML model can predict the expected outcome. Conventional object detection loss functions rely on aggregated bounding-box regression metrics such as the distance, overlap area, and aspect ratio of the predicted and ground-truth boxes (e.g. GIoU, CIoU, ICIoU). However, none of the methods proposed and used to date takes into account the direction of the mismatch between the ground-truth box and the predicted, "experimental" box. This shortcoming results in slower and less effective convergence, as the predicted box can "wander around" during training and eventually produce a worse model. In this paper a new loss function, SIoU, is proposed, in which the penalty metric is redefined to take into account the angle of the vector between the desired and predicted regressions. Applied to conventional neural networks and datasets, SIoU is shown to improve both training speed and inference accuracy. The effectiveness of the proposed loss function was demonstrated in a number of simulations and tests.
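To make the core idea concrete, the following is a minimal sketch of an IoU-based loss augmented with a direction-aware penalty on the vector between box centers. The abstract does not give the exact SIoU formulation, so the specific angle term used here (maximal at a 45° offset, zero when the offset is axis-aligned) and the blending weight `gamma` are illustrative assumptions, not the paper's definition.

```python
import math

def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) corner coordinates.
    ix1 = max(box_a[0], box_b[0]); iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2]); iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def angle_cost(box_pred, box_gt):
    # Angle of the vector between the two box centers, relative to the axes.
    cx_p = (box_pred[0] + box_pred[2]) / 2; cy_p = (box_pred[1] + box_pred[3]) / 2
    cx_g = (box_gt[0] + box_gt[2]) / 2;     cy_g = (box_gt[1] + box_gt[3]) / 2
    sigma = math.hypot(cx_g - cx_p, cy_g - cy_p)  # center distance
    if sigma == 0.0:
        return 0.0  # centers coincide: no directional mismatch to penalize
    alpha = math.asin(min(1.0, abs(cy_g - cy_p) / sigma))
    # Hypothetical penalty: 0 when the offset is purely horizontal or
    # vertical, 1 when it is at 45 degrees (the "worst" direction).
    return math.sin(2 * alpha) ** 2

def siou_like_loss(box_pred, box_gt, gamma=0.5):
    # Illustrative blend: standard (1 - IoU) plus the directional penalty.
    return 1.0 - iou(box_pred, box_gt) + gamma * angle_cost(box_pred, box_gt)
```

During training, such a directional term pushes the predicted center first onto the nearest axis through the ground-truth center, constraining the "wander around" behavior the abstract describes.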