Contextual information in search sessions is important for capturing users' search intents. Various approaches have been proposed to model user behavior sequences in order to improve document ranking within a session. Typically, training samples of (search context, document) pairs are drawn at random in each training epoch. In reality, the difficulty of understanding a user's search intent and judging a document's relevance varies greatly from one search context to another. Mixing training samples of different difficulties may confuse the model's optimization process. In this work, we propose a curriculum learning framework for context-aware document ranking, in which the ranking model learns matching signals between the search context and the candidate document in an easy-to-hard manner. In doing so, we aim to guide the model gradually toward a global optimum. To leverage both positive and negative examples, we design two curricula. Experiments on two real query log datasets show that the proposed framework significantly improves the performance of several existing methods, demonstrating the effectiveness of curriculum learning for context-aware document ranking.
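The easy-to-hard idea described above can be illustrated with a minimal sketch. It is not the framework's actual implementation: the abstract does not specify how difficulty is measured or how the two curricula are built, so the `difficulty` scoring function, the linear pacing schedule, and the names `pacing_fraction` and `curriculum_batches` are all assumptions made for illustration. The sketch sorts samples by an assumed difficulty score and, at each training step, draws a batch only from the easiest fraction currently allowed, widening that fraction over time.

```python
import math
import random


def pacing_fraction(step, total_steps, start=0.2):
    """Hypothetical linear pacing: share of the easiest samples usable at `step`.

    Grows from `start` at step 0 to 1.0 at the final step.
    """
    progress = min(max(step / max(total_steps - 1, 1), 0.0), 1.0)
    return start + (1.0 - start) * progress


def curriculum_batches(samples, difficulty, total_steps, batch_size,
                       start=0.2, seed=0):
    """Yield one batch per step, sampling from an easy-to-hard growing pool.

    `difficulty` is an assumed scoring function (lower means easier); the
    abstract does not say how difficulty is actually estimated.
    """
    rng = random.Random(seed)
    ordered = sorted(samples, key=difficulty)  # easiest samples first
    n = len(ordered)
    for step in range(total_steps):
        frac = pacing_fraction(step, total_steps, start)
        pool_size = min(n, max(1, math.ceil(frac * n)))
        pool = ordered[:pool_size]  # only the easiest `pool_size` samples
        yield rng.sample(pool, min(batch_size, pool_size))


if __name__ == "__main__":
    # Toy data: integers whose value stands in for a difficulty score.
    toy_samples = list(range(100))
    for i, batch in enumerate(
        curriculum_batches(toy_samples, lambda x: x, total_steps=5, batch_size=4)
    ):
        print(i, sorted(batch))
```

In this toy run, early batches contain only low-valued (easy) items, and later batches may draw from the full set. A real (search context, document) setup would replace the integer samples and the identity difficulty function with its own data and scoring.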