In contrast to traditional exhaustive search, which scores every document against each query, selective search first clusters the documents into several groups and then restricts each query's search to one or a few of those groups. Selective search is designed to reduce latency and computation in modern large-scale search systems. In this study, we propose MICO, a Mutual Information CO-training framework for selective search that requires only minimal supervision from search logs. After training, MICO not only clusters the documents but also routes unseen queries to the relevant clusters for efficient retrieval. In our empirical experiments, MICO significantly improves performance on multiple selective search metrics and outperforms a number of competitive existing baselines.
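The cluster-then-route idea behind selective search can be illustrated with a small sketch. This is not MICO: plain k-means stands in for the learned document clustering, and nearest-centroid lookup stands in for the learned query router. All function names, parameters, and the toy vectors below are hypothetical and chosen only for illustration.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(docs, k, iters=10):
    """Plain k-means, seeded deterministically with the first k documents.

    Assumption: this is a stand-in for whatever clustering a real
    selective search system uses; it is not the method from the abstract.
    """
    centroids = [list(d) for d in docs[:k]]
    assign = [0] * len(docs)
    for _ in range(iters):
        # Assign each document to its nearest centroid.
        assign = [min(range(k), key=lambda c: sqdist(d, centroids[c])) for d in docs]
        # Recompute each centroid from its members (keep it if empty).
        for c in range(k):
            members = [d for d, a in zip(docs, assign) if a == c]
            if members:
                centroids[c] = centroid(members)
    return centroids, assign


def selective_search(query, docs, centroids, assign, n_clusters=1, top_k=2):
    """Route the query to its nearest clusters and search only inside them.

    Returns the indices of the top_k closest documents and the number of
    documents actually scanned.
    """
    ranked = sorted(range(len(centroids)), key=lambda c: sqdist(query, centroids[c]))
    chosen = set(ranked[:n_clusters])
    candidates = [i for i, a in enumerate(assign) if a in chosen]
    candidates.sort(key=lambda i: sqdist(query, docs[i]))
    return candidates[:top_k], len(candidates)


# Toy corpus: two well-separated groups of 2-D "document embeddings".
DOCS = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]

if __name__ == "__main__":
    cents, assign = kmeans(DOCS, k=2)
    hits, scanned = selective_search([10, 10.4], DOCS, cents, assign)
    print("cluster assignment:", assign)
    print("top hits:", hits, "| scanned", scanned, "of", len(DOCS), "documents")
```

In this toy setup the query is compared against only the documents of the cluster it is routed to, which is the source of the latency and computation savings described above. Raising `n_clusters` trades some of that saving for a lower risk of missing relevant documents that landed in another cluster.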