Presenting the complexities of a model's performance is a communication bottleneck that threatens collaborations between data scientists and subject matter experts. Accuracy and error metrics alone fail to tell the whole story of a model (its risks, strengths, and limitations), making it difficult for subject matter experts to feel confident in deciding to use a model. As a result, models may fail in unexpected ways if their weaknesses are not clearly understood. Alternatively, models may go unused, as subject matter experts disregard poorly presented models in favor of familiar, yet arguably substandard, methods. In this paper, we propose the effective use of visualization as a medium for communication between data scientists and subject matter experts. Our research addresses the gap between common practices in model performance communication and the understanding of subject matter experts and decision makers. We derive a set of communication guidelines and recommended visualizations for communicating model performance based on interviews with both data scientists and subject matter experts at the same organization. We conduct a follow-up study with subject matter experts to evaluate the efficacy of our guidelines by comparing presentations of model performance with and without our recommendations. We find that our proposed guidelines made subject matter experts more aware of the tradeoffs of the presented model. Participants realized that current communication methods left them without a robust understanding of the model's performance, potentially giving them misplaced confidence in the use of the model.