Users' search tasks have become increasingly complex, often requiring multiple queries and repeated interaction with the results. Recent studies have shown that modeling a user's historical behaviors within a session helps in understanding the current search intent. Existing context-aware ranking models primarily encode the current session sequence (from the first behavior to the current query) and compute the ranking score from the resulting high-level representations. However, the current session sequence usually contains noise, that is, behaviors that are useless for inferring the search intent, which can degrade the quality of the encoded representations. To improve the encoding of the current user behavior sequence, we propose adding a decoder that exploits two extra sources of information: future sequences and a supplemental query. Specifically, we design three generative tasks that help the encoder infer the actual search intent: (1) predicting future queries, (2) predicting future clicked documents, and (3) predicting a supplemental query. We learn the ranking task jointly with these generative tasks in an encoder-decoder framework. Extensive experiments on two public search logs demonstrate that our model outperforms all existing baselines and that the designed generative tasks do help the ranking task. Additional experiments show that our approach can be easily applied to various Transformer-based encoder-decoder models and improves their performance.
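The joint learning described above can be illustrated with a minimal sketch. The abstract does not specify the loss functions or how they are combined, so everything here is an assumption made for illustration: a logistic pairwise ranking loss, a mean negative log-likelihood for each generative task, a single weight `gen_weight`, and the hypothetical function names `pairwise_ranking_loss`, `generation_loss`, and `joint_loss`.

```python
import math
from typing import Dict, Sequence


def pairwise_ranking_loss(pos_score: float, neg_score: float) -> float:
    """Logistic pairwise loss (an assumed choice, not taken from the paper).

    Small when the clicked document scores higher than the unclicked one.
    Computed as a numerically stable softplus of (neg_score - pos_score).
    """
    x = neg_score - pos_score
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def generation_loss(token_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood of the target tokens.

    token_probs holds the probability the decoder assigned to each
    ground-truth token of one target sequence.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def joint_loss(
    pos_score: float,
    neg_score: float,
    gen_token_probs: Dict[str, Sequence[float]],
    gen_weight: float = 1.0,  # hypothetical weighting hyperparameter
) -> float:
    """Ranking loss plus a weighted sum of the generative-task losses."""
    rank = pairwise_ranking_loss(pos_score, neg_score)
    gen = sum(generation_loss(probs) for probs in gen_token_probs.values())
    return rank + gen_weight * gen


# Toy decoder probabilities for the three generative tasks (made-up numbers).
example_probs = {
    "future_query": [0.5, 0.25],
    "future_clicked_document": [0.5],
    "supplemental_query": [1.0],
}
example_loss = joint_loss(2.0, 0.0, example_probs)
```

In a real model the scores and token probabilities would come from the shared encoder and the decoder, so that gradients from the generative tasks also shape the encoder used for ranking; the plain floats here only show how the terms might be combined.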