Developing efficient sequential bidding strategies for repeated auctions is an important practical challenge in various marketing tasks. In this setting, the bidding agent obtains information, on both the value of the item at sale and the behavior of the other bidders, only when she wins the auction. Standard bandit theory does not apply to this problem due to the presence of action-dependent censoring. In this work, we consider second-price auctions and propose novel, efficient UCB-like algorithms for this task. These algorithms are analyzed in the stochastic setting, assuming regularity of the distribution of the opponents' bids. We provide regret upper bounds that quantify the improvement over the baseline algorithm proposed in the literature. The improvement is particularly significant in cases when the value of the auctioned item is low, yielding a spectacular reduction in the order of the worst-case regret. We further provide the first parametric lower bound for this problem that applies to generic UCB-like strategies. As an alternative, we propose more explainable strategies which are reminiscent of the Explore Then Commit bandit algorithm. We provide a critical analysis of this class of strategies, showing both important advantages and limitations. In particular, we provide a minimax lower bound and propose a nearly minimax-optimal instance of this class.
翻译:为多次拍卖制定高效的顺序招标战略是各种营销任务中一项重要的实际挑战。在这一背景下,投标代理人只有在她赢得拍卖时,才获得关于销售物品的价值和其他投标人行为的信息。标准匪帮理论不适用于这一问题,因为存在依赖行动的审查,因此标准匪帮理论不适用于这一问题。在这项工作中,我们考虑二价拍卖,并为这项任务提出新的、高效的UCB类算法。这些算法是在随机分析环境中分析的,假设对对手的标书的分布是正常的。我们遗憾地提供了比文献中提议的基线算法改进程度量化的上限。当拍卖物品的价值低,导致最坏情况的遗憾程度大幅下降时,这种改进尤其重要。我们进一步提供了适用于通用UCB类战略的这一问题的第一等低参数。作为替代办法,我们提出了比Explorate The Ty Center Adroital算法更具可解释性的战略。我们对这一类战略进行了批判性的分析,既展示了重要的优势,也展示了近乎极限的微量级限制。我们特别提供了一种微量级的微级算法。