To support complex search tasks, in which the initial information need is complex or may change during the search, a search engine must adapt what it delivers as the user's information need evolves. To support this dynamic ranking paradigm effectively, search result ranking must incorporate both the user feedback received and the information displayed so far. To address this problem, we introduce RLIrank, a novel reinforcement learning-based approach. We first build an adapted reinforcement learning framework that integrates the key components of dynamic search. We then implement a new Learning to Rank (LTR) model for each iteration of the dynamic search, using a recurrent Long Short-Term Memory (LSTM) neural network that estimates the gain of each next result, learning from each previously ranked document. To incorporate the user's feedback, we develop a word-embedding variation of the classic Rocchio algorithm, which helps guide the ranking towards high-value documents. These innovations enable RLIrank to outperform the previously reported methods from the TREC Dynamic Domain Track 2017 and to exceed all methods from the TREC Dynamic Domain Track 2016 after multiple search iterations, advancing the state of the art for dynamic search.
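As an illustration of the feedback step, the following is a minimal sketch of the classic Rocchio update applied to dense vectors, such as averaged word embeddings of a query and of documents the user marked relevant or non-relevant. It is not RLIrank's actual formulation, which the abstract does not specify. The function names, the toy two-dimensional vectors, and the weights `alpha`, `beta`, and `gamma` are all assumptions chosen for illustration.

```python
import math


def mean_vector(vectors, dim):
    """Component-wise mean of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio update: move the query vector towards the centroid of
    relevant documents and away from the centroid of non-relevant ones.
    The weights are illustrative assumptions, not values from the paper."""
    dim = len(query)
    rel_centroid = mean_vector(relevant, dim)
    non_centroid = mean_vector(nonrelevant, dim)
    return [
        alpha * q + beta * r - gamma * n
        for q, r, n in zip(query, rel_centroid, non_centroid)
    ]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy embeddings (hypothetical): the user liked documents along the second axis.
query = [1.0, 0.0]
relevant = [[0.0, 1.0], [0.0, 1.0]]
nonrelevant = [[1.0, 0.0]]

updated = rocchio_update(query, relevant, nonrelevant)
```

In this toy setup the updated query vector is more similar to the relevant documents than the original query was, so a ranker that scores candidates by `cosine(updated, doc_vec)` would favour documents resembling those the user has already marked as valuable.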