We study the problem of inferring heterogeneous treatment effects from time-to-event data. While both the related problems of (i) estimating treatment effects for binary or continuous outcomes and (ii) predicting survival outcomes have been well studied in the recent machine learning literature, their combination -- albeit of high practical relevance -- has received considerably less attention. With the ultimate goal of reliably estimating the effects of treatments on instantaneous risk and survival probabilities, we focus on the problem of learning (discrete-time) treatment-specific conditional hazard functions. We find that unique challenges arise in this context due to a variety of covariate shift issues that go beyond a mere combination of well-studied confounding and censoring biases. We theoretically analyse their effects by adapting recent generalization bounds from domain adaptation and treatment effect estimation to our setting and discuss implications for model design. We use the resulting insights to propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations. We investigate performance across a range of experimental settings and empirically confirm that our method outperforms baselines by addressing covariate shifts from various sources.
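To make the link between discrete-time hazards and survival probabilities concrete, the following is a minimal sketch, not the method proposed here. It assumes the standard discrete-time convention in which the hazard h(k | x, a) is the probability of an event at step k given survival up to k, so that S(t | x, a) = prod_{k <= t} (1 - h(k | x, a)). The function names and the hazard values are hypothetical and chosen only for illustration; in practice the hazards would come from a fitted treatment-specific model.

```python
from typing import List, Sequence


def survival_from_hazards(hazards: Sequence[float]) -> List[float]:
    """Turn discrete-time hazards h(1..T) into survival probabilities S(1..T).

    Assumed convention: S(t) = prod_{k <= t} (1 - h(k)).
    """
    survival = []
    still_at_risk = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        still_at_risk *= 1.0 - h
        survival.append(still_at_risk)
    return survival


def survival_effect(
    hazards_treated: Sequence[float], hazards_control: Sequence[float]
) -> List[float]:
    """Difference in survival probability (treated minus control) per time step."""
    if len(hazards_treated) != len(hazards_control):
        raise ValueError("both hazard sequences must cover the same time steps")
    s_treated = survival_from_hazards(hazards_treated)
    s_control = survival_from_hazards(hazards_control)
    return [s1 - s0 for s1, s0 in zip(s_treated, s_control)]


# Hypothetical hazards for one individual over two time steps.
hazards_control = [0.5, 0.5]
hazards_treated = [0.25, 0.25]

print(survival_from_hazards(hazards_control))            # [0.5, 0.25]
print(survival_from_hazards(hazards_treated))            # [0.75, 0.5625]
print(survival_effect(hazards_treated, hazards_control))  # [0.25, 0.3125]
```

Because survival is a running product of the hazards, an estimation error in an early hazard carries over to every later survival probability, which is one reason hazard estimates need to be reliable at each time step.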