Real-time bidding (RTB) systems, which use auctions to programmatically allocate user impressions among competing advertisers, continue to enjoy widespread success in digital advertising. Assessing the effectiveness of such advertising remains a persistent challenge in research and practice. This paper presents a new experimental design for causal inference on advertising bought through such mechanisms. Our method leverages the economic structure of first- and second-price auctions, which are ubiquitous in RTB systems, embedded within a multi-armed bandit (MAB) setup for online adaptive experimentation. We implement it via a modified Thompson sampling (TS) algorithm that estimates the causal effects of advertising while minimizing the cost of experimentation to the advertiser by simultaneously learning the optimal bidding policy that maximizes her expected payoff from auction participation. Simulations show that the proposed method not only accomplishes the advertiser's goals, but does so at a much lower cost than more conventional experimentation policies aimed at causal inference.
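To make the bandit ingredient of the abstract concrete, the following is a minimal sketch of plain Gaussian Thompson sampling over a discrete grid of bids in a simulated second-price auction. It is not the paper's modified TS algorithm and it does not estimate causal effects; it only illustrates how a bidding policy might be learned adaptively. The function names, the bid grid, the standard normal prior, the known noise standard deviation, and the uniform distribution of the highest competing bid are all assumptions introduced for illustration.

```python
import numpy as np


def second_price_payoff(bid, value, competing_bid):
    """Payoff in a second-price auction: the advertiser wins when her bid
    exceeds the highest competing bid and then pays that competing bid."""
    return value - competing_bid if bid > competing_bid else 0.0


def thompson_bidding(bid_grid, value, n_rounds, rng, noise_sd=0.25):
    """Gaussian Thompson sampling over a grid of candidate bids (the arms).

    Assumes a N(0, 1) prior on each arm's mean payoff and a known noise
    standard deviation; both are illustrative choices, not from the paper.
    Returns the number of pulls and the total payoff per arm.
    """
    n_arms = len(bid_grid)
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for _ in range(n_rounds):
        # Conjugate normal-normal posterior for each arm's mean payoff.
        post_var = 1.0 / (1.0 + counts / noise_sd**2)
        post_mean = post_var * sums / noise_sd**2
        # Sample one plausible mean per arm and bid according to the best draw.
        draws = rng.normal(post_mean, np.sqrt(post_var))
        arm = int(np.argmax(draws))
        # Hypothetical environment: highest competing bid is Uniform(0, 1).
        competing_bid = rng.uniform(0.0, 1.0)
        reward = second_price_payoff(bid_grid[arm], value, competing_bid)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums
```

A call such as `thompson_bidding(np.linspace(0.1, 1.0, 10), value=0.6, n_rounds=5000, rng=np.random.default_rng(0))` returns per-bid pull counts and cumulative payoffs. Under the uniform-competitor assumption, the expected payoff of bidding b is 0.6b - b²/2, which is maximized at b = 0.6, so the sampler should tend to concentrate its pulls near that bid as rounds accumulate.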