Predictive models are increasingly used to make consequential decisions in high-stakes domains such as healthcare, finance, and policy. It is therefore critical to ensure that these models make accurate predictions, are robust to shifts in the data, do not rely on spurious features, and do not unduly discriminate against minority groups. To this end, several approaches spanning areas such as explainability, fairness, and robustness have been proposed in recent literature. Such approaches need to be human-centered, as they ultimately serve to make models understandable to their users. However, there is a research gap in understanding the human-centric needs and challenges of monitoring machine learning (ML) models once they are deployed. To fill this gap, we conducted an interview study with 13 practitioners who have experience at the intersection of deploying ML models and engaging with customers, spanning domains such as financial services, healthcare, hiring, online retail, computational advertising, and conversational assistants. We identified various human-centric challenges and requirements for model monitoring in real-world applications. Specifically, we found that practitioners both need and struggle to obtain monitoring systems that clarify how monitoring observations affect outcomes. Further, such insights must be actionable, robust, customizable for domain-specific use cases, and cognitively considerate to avoid information overload.