Causal Rule Sets for Identifying Subgroups with Enhanced Treatment Effect

A key question in causal inference analyses is how to find subgroups with elevated treatment effects. This paper takes a machine learning approach and introduces a generative model, Causal Rule Sets (CRS), for interpretable subgroup discovery. A CRS model uses a small set of short decision rules to capture a subgroup where the average treatment effect is elevated. We present a Bayesian framework for learning a causal rule set. The Bayesian model consists of a prior that favors simple models, both for better interpretability and to avoid overfitting, and a Bayesian logistic regression that captures the likelihood of the data, characterizing the relation between outcomes, attributes, and subgroup membership. The Bayesian model has tunable parameters that can characterize subgroups of various sizes, giving users a more flexible choice of models from the "treatment efficient frontier". We find maximum a posteriori models using iterative discrete Monte Carlo steps in the joint solution space of rule sets and parameters. To improve search efficiency, we provide theoretically grounded heuristics and bounding strategies to prune and confine the search space. Experiments show that the search algorithm can efficiently recover true underlying subgroups. We apply CRS to public and real-world datasets from domains where interpretability is indispensable. We compare CRS with state-of-the-art rule-based subgroup discovery models. Results show that CRS achieved consistently competitive performance on datasets from various domains, represented by high treatment efficient frontiers.
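To make the notion of a rule set capturing a subgroup concrete, here is a minimal Python sketch. It is not the paper's Bayesian model or its search algorithm: it only shows how a small set of short rules (a disjunction of conjunctions) defines subgroup membership, and it estimates the subgroup's average treatment effect with a plain difference in means, which assumes randomized treatment. The attribute names (`age`, `smoker`, `bmi`), the helpers `covers` and `subgroup_effect`, and the toy data are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# A record holds attribute values plus a treatment flag and an observed outcome.
Record = Dict[str, float]
# A condition tests a record; a rule is a conjunction of conditions;
# a rule set is a disjunction of rules.
Condition = Callable[[Record], bool]
Rule = List[Condition]
RuleSet = List[Rule]


def covers(rule_set: RuleSet, record: Record) -> bool:
    """Return True if the record satisfies every condition of at least one rule."""
    return any(all(cond(record) for cond in rule) for rule in rule_set)


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def subgroup_effect(rule_set: RuleSet, data: Sequence[Record]) -> float:
    """Difference in mean outcome (treated minus control) inside the covered subgroup.

    Assumption: treatment was randomized, so a difference in means is a
    reasonable effect estimate. This is a stand-in, not the paper's likelihood.
    """
    covered = [r for r in data if covers(rule_set, r)]
    treated = [r["outcome"] for r in covered if r["treated"] == 1]
    control = [r["outcome"] for r in covered if r["treated"] == 0]
    if not treated or not control:
        raise ValueError("subgroup needs at least one treated and one control record")
    return mean(treated) - mean(control)


# Hypothetical rule set: (age >= 60 AND smoker) OR (bmi > 30).
RULE_SET: RuleSet = [
    [lambda r: r["age"] >= 60, lambda r: r["smoker"] == 1],
    [lambda r: r["bmi"] > 30],
]

# Toy data, invented purely for illustration.
DATA: List[Record] = [
    {"age": 65, "smoker": 1, "bmi": 25, "treated": 1, "outcome": 1},
    {"age": 70, "smoker": 1, "bmi": 22, "treated": 0, "outcome": 0},
    {"age": 40, "smoker": 0, "bmi": 32, "treated": 1, "outcome": 1},
    {"age": 45, "smoker": 0, "bmi": 35, "treated": 0, "outcome": 1},
    {"age": 30, "smoker": 0, "bmi": 22, "treated": 1, "outcome": 0},
    {"age": 35, "smoker": 1, "bmi": 24, "treated": 0, "outcome": 0},
]

if __name__ == "__main__":
    print("subgroup size:", sum(covers(RULE_SET, r) for r in DATA))
    print("subgroup effect:", subgroup_effect(RULE_SET, DATA))
```

On this toy data the two rules cover four of the six records. A search procedure such as the one the abstract describes would compare many candidate rule sets, trading off subgroup size against estimated effect, whereas this sketch evaluates a single fixed rule set.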

