In this paper, we study the learning problem in contextual search, which is motivated by applications such as first-price auction, personalized medicine experiments, and feature-based pricing experiments. In particular, for a sequence of arriving context vectors, with each context associated with an underlying value, the decision-maker either makes a query at a certain point or skips the context. The decision-maker will only observe the binary feedback on the relationship between the query point and the value associated with the context. We study a PAC learning setting, where the goal is to learn the underlying mean value function in context with a minimum number of queries. To address this challenge, we propose a tri-section search approach combined with a margin-based active learning method. We show that the algorithm only needs to make $O(1/\varepsilon^2)$ queries to achieve an $\epsilon$-estimation accuracy. This sample complexity significantly reduces the required sample complexity in the passive setting, at least $\Omega(1/\varepsilon^4)$.
翻译:在本文中,我们研究了背景搜索中的学习问题,这是由第一价格拍卖、个性化医学实验和基于特征的定价实验等应用驱动的。特别是,对于抵达环境矢量的顺序,以及每个与基本价值相关的背景,决策者要么在某个点进行查询,要么跳过上下文。决策者将只观察关于查询点与上下文相关价值之间关系的二进制反馈。我们研究PAC学习环境,目的是在最小的查询次数中学习基本平均值功能。为了应对这一挑战,我们建议采用三部分搜索方法,同时采用基于边际的积极学习方法。我们表明,算法只需查询1美元(1/\varepsilon2),即可达到美元(varepsilon)的估算准确度。这种抽样复杂度将大大降低被动环境中所需的抽样复杂性,至少为$/Omega(1/\varepsilon4)美元。