A growing number of approaches exist to generate explanations for image classification. However, few of these approaches have been subjected to human-subject evaluations, partly because it is challenging to design controlled experiments with natural image datasets: they leave essential factors outside the researcher's control. With our approach, researchers can describe their desired dataset with only a few parameters. Based on these, our library generates synthetic image data of two 3D abstract animals. The resulting data is suitable for algorithmic as well as human-subject evaluations. Our user-study results demonstrate that our method can create biases predictive enough to be picked up by a classifier, yet subtle enough that only every second participant noticed them when inspecting the data visually. Our approach significantly lowers the barrier to conducting human-subject evaluations, thereby facilitating more rigorous investigations into interpretable machine learning. Our library and datasets are available at https://github.com/mschuessler/two4two/
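To illustrate the idea of describing a dataset with only a few parameters, the following is a minimal, self-contained sketch of how a controlled spurious correlation between an object attribute and the class label might be parameterized. The names (`DatasetSpec`, `sample_scene_parameters`) and the color-based bias are illustrative assumptions only and do not reflect the actual two4two API; see the repository linked above for the real interface.

```python
from dataclasses import dataclass
import random

@dataclass
class DatasetSpec:
    """A few parameters that fully describe one synthetic dataset (illustrative only)."""
    n_samples: int = 10_000      # number of images to generate
    bias_strength: float = 0.7   # how strongly the object color correlates with the label
    seed: int = 0                # makes the sampled scene parameters reproducible

def sample_scene_parameters(spec: DatasetSpec):
    """Sample per-image scene parameters: a class label plus a possibly biased color."""
    rng = random.Random(spec.seed)
    scenes = []
    for _ in range(spec.n_samples):
        label = rng.choice(["animal_a", "animal_b"])
        # Inject a spurious correlation: with probability `bias_strength`
        # the object color is predictive of the label; otherwise it is random.
        if rng.random() < spec.bias_strength:
            color = "blue" if label == "animal_a" else "red"
        else:
            color = rng.choice(["blue", "red"])
        scenes.append({"label": label, "color": color})
    return scenes

if __name__ == "__main__":
    # A stronger bias_strength makes the shortcut more predictive for a classifier,
    # while remaining potentially hard for a human to spot by visual inspection.
    for scene in sample_scene_parameters(DatasetSpec(n_samples=5, bias_strength=0.9)):
        print(scene)
```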