Convolutional Neural Networks (CNNs) have become the de facto state of the art for the main computer vision tasks. However, due to their complex underlying structure, their decisions are hard to understand, which limits their use in some contexts of the industrial world. A common and hard-to-detect challenge in machine learning (ML) tasks is data bias. In this work, we present a systematic approach to uncovering data bias by means of attribution maps. For this purpose, an artificial dataset with a known bias is first created and used to train intentionally biased CNNs. The networks' decisions are then inspected using attribution maps. Finally, meaningful metrics are used to measure the attribution maps' representativeness with respect to the known bias. The proposed study shows that some attribution map techniques highlight the presence of bias in the data better than others, and that metrics can support the identification of bias.