According to recent studies, commonly used computer vision datasets contain about 4% label errors. For example, the COCO dataset is known for a high level of label noise, which limits its usefulness for training robust deep neural architectures for real-world scenarios. To model this noise, in this paper we propose homoscedastic aleatoric uncertainty estimation and present a series of novel loss functions to address the problem of image object detection at scale. Specifically, the proposed functions are based on Bayesian inference, and we incorporate them into RetinaNet, a widely adopted deep learning architecture for object detection. We also show that modeling homoscedastic aleatoric uncertainty with our novel functions increases model interpretability and improves object detection performance when evaluated on the COCO dataset.
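
The abstract does not spell out the proposed loss functions, so the following is only a minimal sketch of one widely used way to express homoscedastic aleatoric uncertainty in a loss: each task (for example, classification and box regression in a detector) receives a single learned log-variance `s`, and its loss is scaled by `exp(-s)` with `s` added as a regularizer. This is an assumed, generic formulation for illustration, not the paper's novel functions; the names `homoscedastic_weighted_loss` and `optimal_log_variance` are hypothetical.

```python
import math


def homoscedastic_weighted_loss(task_losses, log_variances):
    """Combine per-task losses using one log-variance per task.

    Assumed generic form (not the paper's loss):
        total = sum_i 0.5 * exp(-s_i) * L_i + 0.5 * s_i
    where s_i = log(sigma_i^2) is shared by all inputs (homoscedastic).
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("need exactly one log-variance per task loss")
    total = 0.0
    for loss, s in zip(task_losses, log_variances):
        # A large s down-weights a noisy task; the + 0.5 * s term
        # keeps s from growing without bound.
        total += 0.5 * math.exp(-s) * loss + 0.5 * s
    return total


def optimal_log_variance(loss):
    """Log-variance minimizing 0.5 * exp(-s) * loss + 0.5 * s for a fixed loss.

    Setting the derivative with respect to s to zero gives exp(s) = loss.
    """
    if loss <= 0:
        raise ValueError("loss must be positive")
    return math.log(loss)


if __name__ == "__main__":
    # Hypothetical loss values for a classification and a regression head.
    cls_loss, reg_loss = 1.0, 2.0
    fixed = homoscedastic_weighted_loss([cls_loss, reg_loss], [0.0, 0.0])
    tuned = homoscedastic_weighted_loss(
        [cls_loss, reg_loss],
        [optimal_log_variance(cls_loss), optimal_log_variance(reg_loss)],
    )
    print(fixed, tuned)
```

In a real training loop the log-variances would be trainable parameters updated by the optimizer alongside the network weights. Their learned values can then be read as a per-task noise estimate, which is one sense in which this kind of modeling can aid interpretability.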