Machine learning (ML) is being applied to a diverse and ever-growing set of domains. In many cases, domain experts, who often have no expertise in ML or data science, are asked to use ML predictions to make high-stakes decisions. Multiple ML usability challenges can arise as a result, such as a lack of user trust in the model, an inability to reconcile human-ML disagreement, and ethical concerns about oversimplifying complex problems to a single algorithmic output. In this paper, we investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners. Through an iterative design process involving ML scientists, visualization researchers, and domain experts (child welfare screeners), we first identified four key ML challenges and homed in on one promising explainable ML technique, local factor contributions, to address them. We then implemented and evaluated Sibyl, a visual analytics tool designed to increase the interpretability and interactivity of local factor contributions. We demonstrate the effectiveness of the tool through two formal user studies, one with 12 non-expert participants and one with 13 expert participants. From the feedback collected, we distill a list of design implications that can guide researchers who aim to develop interpretable and interactive visualization tools for ML prediction models deployed for child welfare screeners and other similar domain experts.
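To make the term concrete: "local factor contributions" refers to per-prediction attributions that decompose a single model output into additive contributions from individual input factors (SHAP values are one well-known instance). The following is a minimal sketch, not the paper's actual implementation, using the closed-form case of a linear model, where each factor's contribution for one case is its coefficient times the factor's deviation from the training mean. The factor names and data are hypothetical and are not drawn from the Sibyl deployment.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data: three hypothetical screening factors
# (illustration only, not the actual Sibyl feature set).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
baseline = model.predict(X).mean()   # average model output over the training set
x_case = X[0]                        # a single case to explain

# For a linear model, the local contribution of factor i is
# coef_i * (x_i - mean_i); the contributions sum to (prediction - baseline).
contributions = model.coef_ * (x_case - X.mean(axis=0))

factor_names = ["prior_referrals", "age_of_child", "household_size"]  # hypothetical
for name, c in zip(factor_names, contributions):
    print(f"{name:>16}: {c:+.3f}")
print(f"baseline {baseline:.3f} + contributions {contributions.sum():+.3f} "
      f"= prediction {model.predict(x_case[None])[0]:.3f}")
```

A tool like Sibyl would render such per-case contributions visually (for example, as signed bars per factor) rather than as raw numbers, which is what the interpretability and interactivity claims above refer to.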