We study the problem of learning individualized dose intervals from observational data. Few previous works address policy learning with continuous treatments, and all of them focus on recommending an optimal dose rather than an optimal dose interval. In this paper, we propose a new method to estimate such an optimal dose interval, which we call the probability dose interval (PDI). The potential outcomes for doses in the PDI are guaranteed to be better than a pre-specified threshold with a given probability (e.g., 50%). The associated nonconvex optimization problem can be solved efficiently by the difference-of-convex-functions (DC) algorithm. We prove that our estimated policy is consistent and that its risk converges to that of the best-in-class policy at a root-n rate. Numerical simulations show the advantage of the proposed method over benchmarks based on outcome modeling. We further demonstrate the performance of our method in determining individualized hemoglobin A1c (HbA1c) control intervals for elderly patients with diabetes.
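For readers unfamiliar with the DC algorithm mentioned above, the following is a minimal sketch of its general iteration on a toy one-dimensional problem. It is not the paper's PDI objective: the function `dca_toy`, the objective f(x) = x^4 - 2x^2, and its split into the convex parts g(x) = x^4 and h(x) = 2x^2 are all assumptions chosen purely for illustration. Each step replaces h by its linearization at the current iterate and solves the resulting convex subproblem, which here has a closed form.

```python
import math

def dca_toy(x0, max_iter=100, tol=1e-12):
    """DC algorithm on the toy objective f(x) = g(x) - h(x),
    with g(x) = x**4 and h(x) = 2*x**2 (both convex).
    Hypothetical example, not the objective used in the paper."""
    x = x0
    for _ in range(max_iter):
        # Linearize h at the current iterate: h'(x) = 4x.
        grad_h = 4.0 * x
        # Convex subproblem: minimize y**4 - grad_h * y over y.
        # First-order condition 4*y**3 = grad_h gives y = cbrt(grad_h / 4).
        x_new = math.copysign(abs(grad_h / 4.0) ** (1.0 / 3.0), grad_h)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

if __name__ == "__main__":
    # Starting points on either side of zero reach the two local minimizers.
    print(dca_toy(0.5))
    print(dca_toy(-0.3))
```

In this toy case the iterates started from a positive point approach the minimizer x = 1, and those started from a negative point approach x = -1. A start at exactly x = 0 stays there, which illustrates that the DC algorithm in general only guarantees convergence to a critical point, not a global minimizer.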