Existing statistical methods can estimate a policy, that is, a mapping from covariates to decisions, which can then guide decision makers (e.g., whether to administer hypotension treatment given the covariates blood pressure and heart rate). There is great interest in using such data-driven policies in healthcare. However, it is often important to explain to the healthcare provider, and to the patient, how a new policy differs from the current standard of care. Such an explanation is easier if one can pinpoint the aspects of the policy (i.e., the parameters for blood pressure and heart rate) that change when moving from the standard of care to the new, suggested policy. To this end, we adapt ideas from Trust Region Policy Optimization (TRPO). Unlike in TRPO, however, we require the difference between the suggested policy and the standard of care to be sparse, which aids interpretability. This yields ``relative sparsity,'' where, through a tuning parameter, $\lambda$, we can approximately control the number of parameters in our suggested policy that differ from their counterparts in the standard of care (e.g., heart rate only). We propose a criterion for selecting $\lambda$, perform simulations, and illustrate our method with a real, observational healthcare dataset, deriving a policy that is easy to explain in the context of the current standard of care. Our work promotes the adoption of data-driven decision aids, which have great potential to improve health outcomes.