Query-aware webpage snippet extraction is widely used in search engines to help users understand the content of returned webpages before clicking. Despite its importance, it has rarely been studied. In this paper, we propose an effective query-aware webpage snippet extraction method named DeepQSE, which aims to select a few sentences that best summarize the webpage content in the context of the input query. DeepQSE first learns a query-aware representation of each sentence to capture the fine-grained relevance between the query and that sentence, and then learns document-aware query-sentence relevance representations for snippet extraction. Because DeepQSE models the query jointly with each sentence, its online inference may be slow. We therefore further propose an efficient version, named Efficient-DeepQSE, which can significantly improve the inference speed of DeepQSE without affecting its performance. The core idea of Efficient-DeepQSE is to decompose query-aware snippet extraction into two stages: a coarse-grained candidate sentence selection stage, in which sentence representations can be cached, and a fine-grained relevance modeling stage. Experiments on two real-world datasets validate the effectiveness and efficiency of our methods.
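The two-stage decomposition can be illustrated with a minimal Python sketch. It is not the DeepQSE or Efficient-DeepQSE model: the bag-of-words `embed` function and the term-coverage `fine_score` are toy stand-ins for the learned encoders, and every name here (`TwoStageExtractor`, `select_candidates`, `extract`, `k`, `n`) is hypothetical. The sketch only shows the control flow: sentence representations are computed once and cached, a cheap query-sentence score selects `k` candidates, and the more expensive joint scorer runs on those candidates alone.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for a learned encoder: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TwoStageExtractor:
    def __init__(self, sentences):
        self.sentences = list(sentences)
        # Query-independent sentence representations, computed once and cached.
        self.cache = [embed(s) for s in self.sentences]

    def select_candidates(self, query, k):
        """Stage 1 (coarse): rank cached sentence vectors against the query."""
        q = embed(query)
        ranked = sorted(
            range(len(self.sentences)),
            key=lambda i: cosine(q, self.cache[i]),
            reverse=True,
        )
        return ranked[:k]

    def fine_score(self, query, sentence):
        """Stage 2 placeholder for a joint query-sentence model.

        Assumption: fraction of distinct query terms found in the sentence.
        """
        q_terms = set(tokenize(query))
        if not q_terms:
            return 0.0
        return len(q_terms & set(tokenize(sentence))) / len(q_terms)

    def extract(self, query, k, n):
        """Run the joint scorer on k candidates only; return n sentences
        in their original document order."""
        candidates = self.select_candidates(query, k)
        best = sorted(
            candidates,
            key=lambda i: self.fine_score(query, self.sentences[i]),
            reverse=True,
        )[:n]
        return [self.sentences[i] for i in sorted(best)]


page = [
    "DeepQSE extracts snippets from webpages.",
    "The weather is nice today.",
    "Snippets summarize webpage content for a query.",
    "Cats sleep a lot.",
]
extractor = TwoStageExtractor(page)
print(extractor.extract("webpage snippets", k=2, n=1))
```

In this sketch the joint scorer is called at most `k` times per query rather than once per sentence, which is where the speed-up of a two-stage design would come from when the joint model is expensive.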