AutoML systems can speed up routine data science work and make machine learning available to those without expertise in statistics and computer science. These systems have gained traction in enterprise settings where pools of skilled data workers are limited. In this study, we conduct interviews with 29 individuals from organizations of different sizes to characterize how they currently use, or intend to use, AutoML systems in their data science work. Our investigation also captures how data visualization is used in conjunction with AutoML systems. Our findings identify three usage scenarios for AutoML, which we summarize in a framework describing the level of automation desired by data workers with different levels of expertise. We surface a tension between speed and human oversight and find that data visualization can do a poor job of balancing the two. Our findings have implications for the design and implementation of human-in-the-loop visual analytics approaches.