Across domains such as medicine, employment, and criminal justice, predictive models often target labels that imperfectly reflect the outcomes of interest to experts and policymakers. For example, clinical risk assessments deployed to inform physician decision-making often predict measures of healthcare utilization (e.g., costs, hospitalization) as a proxy for patient medical need. These proxies are subject to outcome measurement error when they systematically differ from the target outcome they are intended to measure. However, prior modeling efforts to characterize and mitigate outcome measurement error overlook the fact that the decision informed by a model often serves as a risk-mitigating intervention that affects both the target outcome of interest and its recorded proxy. Thus, in these settings, addressing measurement error requires counterfactual modeling of treatment effects on outcomes. In this work, we study intersectional threats to model reliability introduced by outcome measurement error, treatment effects, and selection bias from historical decision-making policies. We develop an unbiased risk minimization method that, given knowledge of proxy measurement error properties, corrects for the combined effects of these challenges. We also develop a method for estimating treatment-dependent measurement error parameters when these are unknown in advance. We demonstrate the utility of our approach theoretically and via experiments on real-world data from randomized controlled trials conducted in healthcare and employment domains. Equally important, we demonstrate that models correcting for outcome measurement error or treatment effects alone suffer from considerable reliability limitations. Our work underscores the importance of considering intersectional threats to model validity during the design and evaluation of predictive models for decision support.
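To make the idea of risk minimization under known measurement error concrete, the following is a minimal sketch of a standard noise-corrected surrogate loss for binary proxy labels with class-conditional error rates. It is not the method developed in this work: it ignores treatment effects and selection bias, and the names `log_loss`, `corrected_loss`, `alpha`, and `beta` are assumptions introduced here for illustration. It assumes `alpha = P(proxy = 1 | target = 0)` and `beta = P(proxy = 0 | target = 1)` are known and that `alpha + beta < 1`.

```python
import numpy as np


def log_loss(p, y):
    """Per-example binary cross-entropy for predicted probability p and label y."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    y = np.asarray(y, dtype=float)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))


def corrected_loss(p, y_proxy, alpha, beta):
    """Surrogate loss on proxy labels whose expectation equals the clean loss.

    alpha: assumed false positive rate of the proxy, P(proxy = 1 | target = 0)
    beta:  assumed false negative rate of the proxy, P(proxy = 0 | target = 1)
    Both may be scalars or per-example arrays (hypothetically, looked up
    by treatment arm).
    """
    y_proxy = np.asarray(y_proxy, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Error rate attached to the observed proxy class, and to the opposite class.
    rate_same = np.where(y_proxy == 1, beta, alpha)
    rate_other = np.where(y_proxy == 1, alpha, beta)
    numerator = (1 - rate_other) * log_loss(p, y_proxy) \
        - rate_same * log_loss(p, 1 - y_proxy)
    return numerator / (1 - alpha - beta)
```

Averaging `corrected_loss` over the proxy-label distribution recovers the loss that would have been computed on the unobserved target labels, which is what makes minimizing it on proxy data unbiased under the stated assumptions. Treatment-dependent error rates could be passed as per-example arrays, for instance `alpha_t[t]` and `beta_t[t]` for a hypothetical treatment indicator `t`, but correcting for selection bias in who received treatment would require additional steps not shown here.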