Due to their promise of superior predictive power relative to human assessment, machine learning models are increasingly being used to support high-stakes decisions. However, the nature of the labels available for training these models often hampers the usefulness of predictive models for decision support. In this paper, we explore the use of historical expert decisions as a rich--yet imperfect--source of information, and we show that it can be leveraged to mitigate some of the limitations of learning from observed labels alone. We consider the problem of estimating expert consistency indirectly when each case in the data is assessed by a single expert, and propose influence functions based methodology as a solution to this problem. We then incorporate the estimated expert consistency into the predictive model meant for decision support through an approach we term label amalgamation. This allows the machine learning models to learn from experts in instances where there is expert consistency, and learn from the observed labels elsewhere. We show how the proposed approach can help mitigate common challenges of learning from observed labels alone, reducing the gap between the construct that the algorithm optimizes for and the construct of interest to experts. After providing intuition and theoretical results, we present empirical results in the context of child maltreatment hotline screenings. Here, we find that (1) there are high-risk cases whose risk is considered by the experts but not wholly captured in the target labels used to train a deployed model, and (2) the proposed approach improves recall for these cases.
翻译:机械学习模型由于对人类评估具有较高的预测能力,因此越来越多地利用机器学习模型来间接估计专家一致性,以支持作出高决策。然而,培训这些模型的标签性质往往妨碍决策支助预测模型的效用。在本文件中,我们探索利用历史专家决定作为丰富但尚不完善的信息来源,我们表明,可以利用这些决定来减轻仅从所观察到的标签中学习的一些局限性。我们认为,当数据中每个案例都由一名专家评估时,就间接地估算专家一致性的问题,并提出以影响功能为基础的方法作为解决这一问题的一种解决办法。然后,我们将估计的专家一致性纳入通过我们用标签组合术语来提供决策支持的预测模型。这样,机器学习模型就可以在专家一致的情况下向专家学习历史专家的决定,并学习其他地方所观察到的标签。我们回顾,拟议的方法如何能够帮助减轻仅从所观察到的标签中学习的共同挑战,缩小算法为专家优化的理念与构建专家兴趣之间的差距。在提供直觉和理论结果后,我们提出了用于决定支持决策支持的预测模型的预测模型的一致性。我们用机器学习模型学习模型向专家学习的经验,在专家中,在专家进行风险评估中,我们研究过的儿童热线筛选时,我们发现这些案例是用来进行风险筛选。