Large health care data repositories such as electronic health records (EHR) open new opportunities to derive individualized treatment strategies that improve disease outcomes. We study the problem of estimating sequential treatment rules tailored to patients' individual characteristics, often referred to as dynamic treatment regimes (DTRs). We seek the optimal DTR, which maximizes the discontinuous value function, through direct maximization of a Fisher consistent surrogate loss function. We show that a large class of concave surrogates fails to be Fisher consistent, in contrast to the classic setting of binary classification. We further characterize a non-concave family of Fisher consistent smooth surrogate functions, which can be optimized with gradient descent using off-the-shelf machine learning algorithms. Compared to the existing direct search approach under the support vector machine framework (Zhao et al., 2015), our proposed DTR estimation via surrogate loss optimization (DTRESLO) method scales better computationally to large sample sizes and allows for a broader functional class for the predictor effects. We establish theoretical properties for our proposed DTR estimator and obtain a sharp upper bound on the regret corresponding to our DTRESLO method. Finite sample performance of our proposed estimator is evaluated through extensive simulations and an application to deriving an optimal DTR for treating sepsis using EHR data from patients admitted to intensive care units.
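To make the idea of maximizing a smooth surrogate of the value function concrete, the following is a minimal sketch and not the DTRESLO implementation. It assumes two stages, binary treatments coded as -1/+1, known propensities of 0.5 (a randomized design), and linear decision functions. The indicator that each assigned treatment agrees with the rule is replaced by a logistic sigmoid, which is an illustrative choice only; the abstract does not specify the surrogate family. All function and variable names (`surrogate_value`, `surrogate_grad`, `fit`, `simulate`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def surrogate_value(b1, b2, X1, X2, A1, A2, Y, prop1, prop2):
    """Smoothed inverse-probability-weighted value estimate.

    The indicators 1[A_t * f_t(X_t) > 0] are replaced by sigmoids
    (an assumed surrogate), with linear f_t(x) = x @ b_t.
    """
    s1 = sigmoid(A1 * (X1 @ b1))
    s2 = sigmoid(A2 * (X2 @ b2))
    w = Y / (prop1 * prop2)
    return np.mean(w * s1 * s2)


def surrogate_grad(b1, b2, X1, X2, A1, A2, Y, prop1, prop2):
    """Analytic gradient of surrogate_value with respect to b1 and b2."""
    n = len(Y)
    s1 = sigmoid(A1 * (X1 @ b1))
    s2 = sigmoid(A2 * (X2 @ b2))
    w = Y / (prop1 * prop2)
    g1 = X1.T @ (w * s2 * s1 * (1.0 - s1) * A1) / n
    g2 = X2.T @ (w * s1 * s2 * (1.0 - s2) * A2) / n
    return g1, g2


def fit(X1, X2, A1, A2, Y, prop1, prop2, lr=0.1, steps=300):
    """Plain gradient ascent on the smoothed value, starting from zero."""
    b1 = np.zeros(X1.shape[1])
    b2 = np.zeros(X2.shape[1])
    for _ in range(steps):
        g1, g2 = surrogate_grad(b1, b2, X1, X2, A1, A2, Y, prop1, prop2)
        b1 = b1 + lr * g1
        b2 = b2 + lr * g2
    return b1, b2


def simulate(n=2000, p=3, seed=0):
    """Toy randomized two-stage data; the generating model is made up."""
    rng = np.random.default_rng(seed)
    X1 = rng.normal(size=(n, p))
    X2 = rng.normal(size=(n, p))
    A1 = rng.choice([-1.0, 1.0], size=n)
    A2 = rng.choice([-1.0, 1.0], size=n)
    # Treatment helps at each stage when the first covariate is positive.
    Y = 3.0 + A1 * X1[:, 0] + A2 * X2[:, 0] + 0.1 * rng.normal(size=n)
    prop = np.full(n, 0.5)
    return X1, X2, A1, A2, Y, prop, prop


if __name__ == "__main__":
    data = simulate()
    b1, b2 = fit(*data)
    print("stage-1 coefficients:", b1)
    print("stage-2 coefficients:", b2)
    print("smoothed value:", surrogate_value(b1, b2, *data))
```

The estimated rule assigns treatment `sign(x @ b_t)` at stage `t`. Because the product of two sigmoids is smooth but not concave, gradient ascent may stop at a local optimum, so in practice one would likely try several starting points or a stochastic optimizer.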