The Synthetic Control method has pioneered a class of powerful data-driven techniques for estimating the counterfactual reality of a unit from donor units. At its core, the technique fits a linear model on the pre-intervention period that combines donor outcomes to yield the counterfactual. However, linearly combining spatial information at each time instant using time-agnostic weights fails to capture important inter-unit and intra-unit temporal contexts and the complex nonlinear dynamics of real data. We instead propose using local spatiotemporal information before the onset of the intervention as a promising way to estimate the counterfactual sequence. To this end, we propose a Transformer model that leverages particular positional embeddings, a modified decoder attention mask, and a novel pre-training task to perform spatiotemporal sequence-to-sequence modeling. Our experiments on synthetic data demonstrate the efficacy of our method in the typical small donor pool setting and its robustness against noise. We also generate actionable healthcare insights at the population and patient levels by simulating three interventions: a state-wide public health policy, to evaluate its effectiveness; an in silico trial for asthma medications, to support randomized controlled trials; and a medical intervention for patients with Friedreich's ataxia, to improve clinical decision-making and promote personalized therapy.
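To make the linear baseline described above concrete, the following is a minimal sketch of it, not the proposed Transformer model. It fits one time-agnostic weight per donor on the pre-intervention period and applies those weights to post-intervention donor outcomes. The function names, the toy data, and the use of unconstrained least squares are assumptions made for illustration; the classic Synthetic Control formulation additionally restricts the weights to be non-negative and to sum to one.

```python
import numpy as np


def fit_donor_weights(donors_pre, target_pre):
    """Fit one time-agnostic weight per donor by ordinary least squares.

    donors_pre: array of shape (T_pre, n_donors), donor outcomes before the intervention.
    target_pre: array of shape (T_pre,), target-unit outcomes before the intervention.
    Assumption: unconstrained least squares stands in for the constrained
    (non-negative, sum-to-one) fit used by classic Synthetic Control.
    """
    weights, *_ = np.linalg.lstsq(donors_pre, target_pre, rcond=None)
    return weights


def predict_counterfactual(donors_post, weights):
    """Combine post-intervention donor outcomes with the fitted weights."""
    return donors_post @ weights


# Hypothetical toy data: four donors following random walks, and a target
# that is an exact (noiseless) linear combination of them.
rng = np.random.default_rng(0)
T_pre, T_post, n_donors = 30, 10, 4
donors = rng.normal(size=(T_pre + T_post, n_donors)).cumsum(axis=0)
true_w = np.array([0.5, 0.3, 0.2, 0.0])
target = donors @ true_w

weights = fit_donor_weights(donors[:T_pre], target[:T_pre])
counterfactual = predict_counterfactual(donors[T_pre:], weights)
```

Because the same `weights` vector is applied at every time step, the prediction at time t depends only on donor outcomes at time t. This is the time-agnostic limitation that motivates the spatiotemporal sequence-to-sequence approach.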