A popular approach to selling online advertising is the waterfall, in which a publisher makes sequential price offers to ad networks for a piece of inventory and awards it to the first network in that order that accepts. The publisher picks the order and the prices to maximize her revenue. A traditional solution is to first learn the demand model and then solve the optimization problem for the learned model. This two-stage approach incurs linear regret. We design an online learning algorithm for this problem that interleaves learning and optimization, and we prove that it has sublinear regret. We evaluate the algorithm on both synthetic and real-world data and show that it quickly learns high-quality pricing strategies. This is the first principled study of learning a waterfall design online by sequential experimentation.
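To make the optimization problem concrete, the following is a minimal sketch of how the expected revenue of a single waterfall could be computed and maximized by brute force. It is an illustration, not the algorithm studied in this work. It assumes that each ad network is offered the inventory at most once, that each network accepts or rejects independently of the others, and that the acceptance probabilities are known. The names `expected_revenue`, `best_waterfall`, and `accept_prob` are hypothetical.

```python
from itertools import permutations, product
from typing import Callable, Dict, List, Sequence, Tuple

# One step of a waterfall: (ad network name, offered price).
Offer = Tuple[str, float]


def expected_revenue(
    waterfall: Sequence[Offer],
    accept_prob: Dict[str, Callable[[float], float]],
) -> float:
    """Expected revenue of offering prices in the given order.

    Assumption (illustrative): networks accept independently, and
    accept_prob[name](price) is the probability that `name` accepts `price`.
    """
    revenue = 0.0
    reach = 1.0  # probability that every earlier offer was rejected
    for network, price in waterfall:
        p = accept_prob[network](price)
        revenue += reach * p * price
        reach *= 1.0 - p
    return revenue


def best_waterfall(
    accept_prob: Dict[str, Callable[[float], float]],
    price_grid: Sequence[float],
) -> List[Offer]:
    """Exhaustive search over all orders and all grid prices.

    Only feasible for a handful of networks and prices; shown to make
    the objective explicit, not as a practical solver.
    """
    networks = list(accept_prob)
    candidates = (
        list(zip(order, prices))
        for order in permutations(networks)
        for prices in product(price_grid, repeat=len(networks))
    )
    return max(candidates, key=lambda w: expected_revenue(w, accept_prob))


if __name__ == "__main__":
    # Hypothetical demand: "a" accepts prices up to 2 half of the time,
    # "b" always accepts prices up to 1.
    demand = {
        "a": lambda price: 0.5 if price <= 2.0 else 0.0,
        "b": lambda price: 1.0 if price <= 1.0 else 0.0,
    }
    plan = best_waterfall(demand, [1.0, 2.0])
    print(plan, expected_revenue(plan, demand))
```

In the online setting described above, the acceptance probabilities are not known in advance, so the publisher has to estimate them from the outcomes of her own offers while still earning revenue. That is the trade-off that the regret analysis quantifies.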