Automated Machine Learning (AutoML) is a rapidly growing set of technologies that automate the model development pipeline by searching the model space and generating candidate models. A critical final step of AutoML is human selection of a final model from dozens of candidates. In current AutoML systems, selection is supported only by performance metrics. Prior work has shown that, in practice, people evaluate ML models on additional criteria, such as the way a model makes predictions. Comparison may happen at multiple levels, from types of errors, to feature importance, to how the model makes predictions for specific instances. We developed \tool{} to support interactive model comparison for AutoML by integrating multiple Explainable AI (XAI) and visualization techniques. We conducted a user study in which we both evaluated the system and used it as a technology probe to understand how users perform model comparison in an AutoML system. We discuss design implications for using XAI techniques for model comparison and for supporting the unique needs of data scientists in comparing AutoML models.