We propose a new Markov Decision Process (MDP) model for ad auctions that captures how users respond to ad quality, with the objective of maximizing long-term discounted revenue. By incorporating user response, the model accounts for all three parties involved in the auction: advertiser, auctioneer, and user. The user's state is modeled as a user-specific click-through rate (CTR), which changes in the next round according to the set of ads shown to the user in the current round. We characterize the optimal mechanism for this MDP as a Myerson auction with a modified virtual value that depends on the advertiser's value distribution, the current user state, and the future impact of showing the ad to the user. Moreover, we propose a simple mechanism built on second-price auctions with personalized reserve prices and show that it achieves a constant-factor approximation to the optimal long-term discounted revenue.
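To make the "second-price auction with personalized reserve prices" concrete, here is a minimal single-round, single-slot sketch. It is illustrative only: the function name `second_price_with_reserves`, the "eager" treatment of reserves (bidders below their own reserve are dropped before ranking), and the example numbers are assumptions, not details taken from the paper. It also omits the user-state (CTR) dynamics and the way the paper chooses the reserves.

```python
from typing import Dict, Optional, Tuple


def second_price_with_reserves(
    bids: Dict[str, float],
    reserves: Dict[str, float],
) -> Optional[Tuple[str, float]]:
    """Single-slot second-price auction with per-advertiser reserve prices.

    Hypothetical "eager" variant (an assumption, not the paper's definition):
    bidders below their own reserve are removed first; the highest remaining
    bidder wins and pays the larger of their own reserve and the next-highest
    remaining bid. Returns (winner, price), or None if nobody clears a reserve.
    """
    # Keep only advertisers whose bid meets their personalized reserve.
    eligible = {a: b for a, b in bids.items() if b >= reserves.get(a, 0.0)}
    if not eligible:
        return None

    # Rank surviving bids from highest to lowest.
    ranked = sorted(eligible.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    runner_up_bid = ranked[1][1] if len(ranked) > 1 else 0.0

    # The winner pays at least their own reserve.
    price = max(reserves.get(winner, 0.0), runner_up_bid)
    return winner, price


# Made-up example: "c" is dropped for bidding below its reserve.
example = second_price_with_reserves(
    bids={"a": 5.0, "b": 4.0, "c": 1.0},
    reserves={"a": 4.5, "b": 2.0, "c": 3.0},
)
```

In this example advertiser "a" wins and pays its reserve of 4.5, because the reserve exceeds the runner-up bid of 4.0. In the setting described in the abstract, the reserves would presumably also depend on the current user state, which this sketch does not model.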