PStrata: An R Package for Principal Stratification

Post-treatment confounding is a common problem in causal inference, with noncompliance, truncation by death, and surrogate endpoints as special cases. Principal stratification (Frangakis and Rubin 2002) is a general framework for defining and estimating causal effects in the presence of post-treatment confounding. A prominent special case is the instrumental variable approach to noncompliance in randomized experiments (Angrist, Imbens, and Rubin 1996). Despite its versatility, principal stratification is not accessible to the vast majority of applied researchers because its inherent latent mixture structure requires complex inference tools and highly customized programming. We develop the R package PStrata to automate the statistical analysis of principal stratification for several common scenarios. PStrata supports both Bayesian and frequentist paradigms. For the Bayesian paradigm, the computing architecture combines R, C++, and Stan: R provides the user interface, Stan automates posterior sampling, and C++ bridges the two by automatically generating Stan code. For the frequentist paradigm, PStrata implements a triply-robust weighting estimator. PStrata accommodates regular outcomes and time-to-event outcomes with both unstructured and clustered data.
