Online controlled experiments are the primary tool for measuring the causal impact of product changes in digital businesses. It is increasingly common for digital products and services to interact with customers in a personalised way. Using online controlled experiments to optimise personalised interaction strategies is challenging because the usual assumption of statistically equivalent user groups is violated. Further challenges arise because users qualify for strategies based on dynamic, stochastic attributes. Traditional A/B tests can salvage statistical equivalence by pre-allocating users to control and exposed groups, but this dilutes the experimental metrics and reduces the test power. We present a stacked incrementality test framework that addresses these problems when running online experiments on personalised user strategies. We derive bounds showing that our framework is superior to the best simple A/B test given enough users, a condition easily met in large-scale online experiments. In addition, we provide a test power calculator and describe a selection of pitfalls and lessons learnt from our experience using the framework.
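To make the dilution point concrete, the sketch below computes the approximate power of a standard two-sided two-sample z-test and shows how it collapses when only a fraction q of pre-allocated users ever qualifies for the strategy. This is not the paper's power calculator; the function name, the chosen parameter values, and the simplifying assumption that pre-allocation scales the intent-to-treat effect by q (with per-user variance roughly unchanged) are all illustrative.

```python
from math import sqrt

from scipy.stats import norm


def ab_test_power(effect: float, sigma: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-sample z-test for a difference in means.

    effect: true difference between group means
    sigma: per-user standard deviation of the metric (assumed equal across groups)
    n_per_group: number of users in each group
    """
    z_crit = norm.ppf(1 - alpha / 2)
    se = sigma * sqrt(2.0 / n_per_group)  # standard error of the difference in means
    # The probability of rejecting in the wrong direction is negligible and ignored.
    return 1 - norm.cdf(z_crit - effect / se)


# Pre-allocation dilution: if only a fraction q of pre-allocated users ever
# qualifies for the personalised strategy, the measurable intent-to-treat
# effect shrinks to q * effect while sigma stays roughly the same, so the
# signal-to-noise ratio, and hence the power, drops sharply.
effect, sigma, n = 0.10, 1.0, 5000
for q in (1.0, 0.5, 0.1):
    print(f"qualification rate {q:.0%}: power = {ab_test_power(q * effect, sigma, n):.3f}")
```

Under these illustrative numbers, power falls from roughly 1.0 at full qualification to about 0.7 at a 50% qualification rate and near the significance level at 10%, which is the dilution cost the stacked incrementality framework is designed to avoid.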