Conventional methods for query autocompletion aim to predict which completed query a user will select from a list. A shortcoming of this approach is that users often do not know which query will yield the best retrieval performance on the current information retrieval system, so any query autocompletion method trained to mimic user behavior can produce suboptimal query suggestions. To overcome this limitation, we propose a new approach that explicitly optimizes query suggestions for downstream retrieval performance. We formulate this as a problem of ranking a set of rankings, where each query suggestion is represented by the downstream item ranking it produces. We then present a learning method that ranks query suggestions by the quality of their item rankings. The algorithm is based on a counterfactual learning approach that leverages feedback on the items (e.g., clicks, purchases) to evaluate query suggestions through an unbiased estimator, thus avoiding the assumption that users write or select optimal queries. We establish theoretical support for the proposed approach and provide learning-theoretic guarantees. We also present empirical results on publicly available datasets and demonstrate real-world applicability using data from an online shopping store.
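To make the idea of ranking query suggestions by the quality of their item rankings more concrete, the following is a minimal sketch, not the estimator from the paper. It assumes an inverse-propensity-scored (IPS) estimate of utility, a hypothetical log format of `(item_id, propensity, reward)` tuples, and a hypothetical position discount of `1 / (rank + 1)`; all function and variable names are illustrative.

```python
def position_weight(rank):
    # Hypothetical discount: items ranked higher (smaller rank) count more.
    return 1.0 / (rank + 1)


def ips_utilities(logs, rankings):
    """Estimate a utility for each query suggestion from logged item feedback.

    logs:     list of (item_id, propensity, reward) tuples, where propensity is
              the assumed probability that the item was observed when logged.
    rankings: dict mapping each suggestion to the item ranking it retrieves.
    """
    utilities = {}
    for suggestion, ranking in rankings.items():
        rank_of = {item: rank for rank, item in enumerate(ranking)}
        total = 0.0
        for item_id, propensity, reward in logs:
            if item_id in rank_of:
                # Reweight the observed reward by 1 / propensity (IPS).
                total += reward / propensity * position_weight(rank_of[item_id])
        utilities[suggestion] = total / len(logs)
    return utilities


def rank_suggestions(logs, rankings):
    # Order suggestions by estimated utility of their item rankings, best first.
    utilities = ips_utilities(logs, rankings)
    return sorted(utilities, key=utilities.get, reverse=True)


# Illustrative data only: two candidate completions and three logged items.
logs = [("a", 0.5, 1), ("b", 0.25, 1), ("c", 1.0, 0)]
rankings = {"shoes": ["a", "b"], "shoes red": ["b", "a"]}
print(rank_suggestions(logs, rankings))
```

In this sketch a suggestion is scored purely by feedback on the items it retrieves, so a query that users rarely select can still rank first if its item ranking surfaces items with high reweighted feedback.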