Testing decisions made with causal machine learning models in the real world is an essential prerequisite for their successful application. We focus on evaluating and improving contextual treatment assignment decisions: personalised treatments applied to individuals (e.g. customers), each with their own contextual information, with the aim of maximising a reward. In this paper we introduce a model-agnostic framework for gathering data to evaluate and improve contextual decision making through Bayesian Experimental Design. Specifically, our method enables data-efficient evaluation of the regret of past treatment assignments. Unlike approaches such as A/B testing, our method avoids assigning treatments that are known to be highly sub-optimal, whilst still exploring enough to gather pertinent information. We achieve this by introducing an information-based design objective, which we optimise end-to-end. Our method applies to both discrete and continuous treatments. In several simulation studies, our information-theoretic approach outperforms the baselines.
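To make the quantity being evaluated concrete, the following minimal sketch computes the regret of past treatment assignments in a toy setting. It assumes the usual definition of regret (the best achievable expected reward in a context minus the expected reward of the treatment actually assigned), which the abstract does not spell out. The names `regret`, `toy_reward`, and `past` are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
from typing import Callable, Sequence


def regret(
    expected_reward: Callable[[float, int], float],
    context: float,
    assigned: int,
    treatments: Sequence[int],
) -> float:
    """Regret of one past assignment: best expected reward minus the assigned one."""
    best = max(expected_reward(context, a) for a in treatments)
    return best - expected_reward(context, assigned)


# Hypothetical toy reward (an assumption, not the paper's model):
# treatment 1 pays off when the context is positive, treatment 0 otherwise.
def toy_reward(context: float, treatment: int) -> float:
    return context if treatment == 1 else -context


# Past (context, assigned treatment) pairs, made up for illustration.
past = [(-2.0, 1), (0.5, 1), (1.5, 0)]

regrets = [regret(toy_reward, c, a, (0, 1)) for c, a in past]
mean_regret = sum(regrets) / len(regrets)
```

In this sketch the expected reward function is known, so regret can be computed directly. In practice it is unknown and must be estimated from data, which is the situation the proposed experimental design framework addresses.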