Online advertising has motivated interest in online selection problems. Displaying ads to the right users benefits both the platform (e.g., via pay-per-click) and the advertisers (by increasing their reach). In practice, not all users click on displayed ads, while the platform's algorithm may miss the users most disposed to do so. This mismatch decreases the platform's revenue and the advertiser's chances to reach the right customers. With this motivation, we propose a secretary problem where a candidate may or may not accept an offer according to a known probability $p$. Because we do not know the top candidate willing to accept an offer, the goal is to maximize a robust objective defined as the minimum over integers $k$ of the probability of choosing one of the top $k$ candidates, given that one of these candidates will accept an offer. Using Markov decision process theory, we derive a linear program for this max-min objective whose solution encodes an optimal policy. The derivation may be of independent interest, as it is generalizable and can be used to obtain linear programs for many online selection models. We further relax this linear program into an infinite counterpart, which we use to provide bounds for the objective and closed-form policies. For $p \geq p^* \approx 0.6$, an optimal policy is a simple threshold rule that observes the first $p^{1/(1-p)}$ fraction of candidates and subsequently makes offers to the best candidate observed so far.
翻译:在线广告激发了人们对在线选择问题的兴趣。 向合适的用户展示广告有利于平台( 例如通过按每点击付费)和广告商( 增加其影响范围 ) 。 实际上, 不是所有用户都点击显示的广告, 而平台的算法可能错过最愿意这样做的用户。 这种不匹配会减少平台的收入和广告商接触正确客户的机会。 有了这个动机, 我们提出了一个秘书问题, 候选人可能接受或可能不接受已知的概率为$1。 由于我们不知道最高级候选人愿意接受报价, 目标是最大限度地实现一个强有力的目标, 即选择最高候选人之一的概率至少超过 $k$1 k, 而考虑到其中一位候选人会接受报价。 使用 Markov 决策程序理论, 我们为这一最大目标的线性程序提供了一条线性程序, 其解决方案将编码为最佳政策。 推算出一个独立的利益, 因为它是普遍化的, 并且可以用来为许多在线选择模式获得线性程序。 我们进一步将这个最起码的线性程序放松到一个绝对的对等 。