We propose D-RISE, a method for generating visual explanations for the predictions of object detectors. Using a proposed similarity metric that accounts for both the localization and the categorization aspects of object detection, our method produces saliency maps that highlight the image regions that most affect a prediction. D-RISE can be considered "black-box" in the software-testing sense, as it only needs access to the inputs and outputs of an object detector. Compared to gradient-based methods, D-RISE is more general: it is agnostic to the particular type of object detector being tested and does not require knowledge of the model's inner workings. We show that D-RISE can be easily applied to different object detectors, including one-stage detectors such as YOLOv3 and two-stage detectors such as Faster-RCNN. We present a detailed analysis of the generated visual explanations to highlight how object detectors use context and which biases they may have learned.
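The abstract does not spell out the masking procedure or the exact form of the similarity metric, so the following is only a minimal sketch of how a black-box saliency method of this kind could work. It assumes RISE-style random occlusion masks and a similarity defined as box IoU (localization) multiplied by the cosine similarity of class-score vectors (categorization). All names here (`similarity`, `random_masks`, `saliency_map`, `toy_detect`) are hypothetical, and the toy detector merely stands in for a real model such as YOLOv3.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def similarity(target, proposal):
    """Assumed metric: IoU (localization) times cosine of class scores (categorization)."""
    (t_box, t_probs), (p_box, p_probs) = target, proposal
    denom = np.linalg.norm(t_probs) * np.linalg.norm(p_probs)
    cos = float(np.dot(t_probs, p_probs) / denom) if denom > 0 else 0.0
    return iou(t_box, p_box) * cos

def random_masks(n, shape, cells, keep_prob, rng):
    """Coarse random binary grids, upsampled to image size by nearest neighbour."""
    h, w = shape
    grids = (rng.random((n, cells, cells)) < keep_prob).astype(np.float32)
    rows = np.arange(h) * cells // h
    cols = np.arange(w) * cells // w
    return grids[:, rows[:, None], cols[None, :]]  # shape (n, h, w)

def saliency_map(image, detect, target, n_masks=500, cells=8, keep_prob=0.5, seed=0):
    """Weight each mask by how well the masked image still yields the target detection."""
    rng = np.random.default_rng(seed)
    masks = random_masks(n_masks, image.shape[:2], cells, keep_prob, rng)
    weights = np.zeros(n_masks, dtype=np.float32)
    for i, mask in enumerate(masks):
        masked = image * mask[..., None] if image.ndim == 3 else image * mask
        # Only inputs and outputs of the detector are used (black-box access).
        detections = detect(masked)
        weights[i] = max((similarity(target, d) for d in detections), default=0.0)
    return np.tensordot(weights, masks, axes=1) / n_masks

# Toy stand-in for a detector: reports the object only if most of it is visible.
OBJECT_BOX = (8, 8, 24, 24)

def toy_detect(img):
    if img[8:24, 8:24].mean() > 0.5:
        return [(OBJECT_BOX, np.array([1.0, 0.0]))]
    return []

image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
target = (OBJECT_BOX, np.array([1.0, 0.0]))
sal = saliency_map(image, toy_detect, target)
```

In this toy setting, `sal` should on average be larger inside the object box than outside it, because masks that keep the object visible receive non-zero weight. A real use would replace `toy_detect` with a wrapper around an actual detector's inference call that returns `(box, class_scores)` pairs.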