Realizing when a model is right for a wrong reason is not trivial and requires a significant effort by model developers. In some cases an input salience method, which highlights the most important parts of the input, may reveal problematic reasoning. But scrutinizing highlights over many data instances is tedious and often infeasible. Furthermore, analyzing examples in isolation does not reveal general patterns in the data or in the model's behavior. In this paper we aim to address these issues and go from understanding single examples to understanding entire datasets and models. The methodology we propose is based on aggregated salience maps, to which we apply clustering, nearest neighbor search and visualizations. Using this methodology we address multiple distinct but common model developer needs by showing how problematic data and model behavior can be identified and explained -- a necessary first step for improving the model.
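To make the aggregation idea concrete, below is a minimal sketch of one way such a pipeline could look, not the paper's exact method: a hypothetical `salience_fn` (a stand-in for any input salience method that returns one score per token) is aggregated into fixed-length per-example vectors, which are then clustered and searched with nearest neighbors. The aggregation choice (summing salience mass per vocabulary item) and the toy salience function are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors


def aggregate_salience(examples, salience_fn, vocab):
    """Turn per-token salience maps into fixed-length per-example vectors
    by summing salience mass per vocabulary item (one simple aggregation choice)."""
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = np.zeros((len(examples), len(vocab)))
    for row, tokens in enumerate(examples):
        scores = salience_fn(tokens)  # assumed: one float per input token
        for tok, score in zip(tokens, scores):
            if tok in index:
                matrix[row, index[tok]] += score
    return matrix


def toy_salience(tokens):
    # Hypothetical salience method for illustration only:
    # pretend longer tokens are more important.
    return [len(t) / 10.0 for t in tokens]


examples = [["the", "movie", "was", "great"],
            ["the", "film", "was", "terrible"],
            ["great", "acting", "overall"]]
vocab = sorted({t for ex in examples for t in ex})

X = aggregate_salience(examples, toy_salience, vocab)

# Cluster aggregated salience maps to surface recurring patterns across the dataset.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Nearest-neighbor search: retrieve examples whose salience profile
# resembles that of a query example.
nn = NearestNeighbors(n_neighbors=2).fit(X)
_, neighbors = nn.kneighbors(X[:1])
print(labels, neighbors)
```

In practice the per-example vectors would come from a real salience method over a trained model, and the clusters and neighbor lists would then be inspected (e.g., via visualizations) to spot problematic data or model behavior.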