A/B tests serve to reliably identify the effect of changes introduced in online services. Online platforms commonly run a large number of simultaneous experiments by splitting incoming user traffic randomly into treatment and control groups. Despite perfect randomization between groups, simultaneous experiments can interact with each other and negatively impact average population outcomes such as engagement metrics, which are measured globally and monitored to protect the overall user experience. It is therefore crucial to measure these interaction effects and attribute their overall impact fairly to the respective experimenters. We suggest an approach to measure and disentangle the effects of simultaneous experiments via a cost-sharing scheme based on Shapley values. We also provide a counterfactual perspective that predicts shared impact from conditional average treatment effects, making use of causal inference techniques. We illustrate our approach in experiments on real-world and synthetic data.
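To make the cost-sharing idea concrete, the following is a minimal sketch (not the paper's algorithm) of exact Shapley value attribution over a small set of experiments. The coalition value function and the lift numbers are hypothetical; in practice the value of a coalition would come from measured or predicted outcomes when only that subset of experiments is active.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values for a coalition value function.

    `value` maps a frozenset of players (experiments) to a real-valued
    outcome, e.g. the total engagement lift when only those experiments run.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                S = frozenset(subset)
                # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Marginal contribution of p on top of coalition S
                total += weight * (value(S | {p}) - value(S))
        phi[p] = total
    return phi

# Hypothetical interaction: experiments A and B each lift engagement
# by 1.0 in isolation, but interfere when run together (joint lift 1.5).
lift = {
    frozenset(): 0.0,
    frozenset({"A"}): 1.0,
    frozenset({"B"}): 1.0,
    frozenset({"A", "B"}): 1.5,
}

phi = shapley_values(["A", "B"], lambda S: lift[frozenset(S)])
# By symmetry, A and B each receive 0.75; the shares sum to the
# joint lift of 1.5, so the interaction cost is split fairly.
```

This illustrates the efficiency property that makes Shapley values attractive for attribution: the individual shares always sum to the value of the grand coalition, so the total interaction loss is fully distributed among the experimenters.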