We introduce the task of entity-centric query refinement. Given an input query whose answer is a (potentially large) collection of entities, the task output is a small set of query refinements meant to assist the user in efficient domain exploration and entity discovery. We propose a method to create a training dataset for this task. For a given input query, we use an existing knowledge base taxonomy as a source of candidate query refinements, and choose a final set of refinements from among these candidates using a search procedure designed to partition the set of entities answering the input query. We demonstrate that our approach identifies refinement sets that human annotators judge to be interesting, comprehensive, and non-redundant. In addition, we find that a text generation model trained on our newly constructed dataset is able to offer refinements for novel queries not covered by an existing taxonomy. Our code and data are available at https://github.com/google-research/language/tree/master/language/qresp.
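The selection step described above (choosing, from taxonomy-derived candidates, a small set of refinements that partitions the entities answering the query) can be illustrated with a minimal sketch. The abstract does not specify the search procedure, so the greedy strategy below is an assumption made purely for illustration, not the authors' method. The function name `select_refinements`, the scoring rule (newly covered entities minus entities already covered), and the `max_refinements` parameter are all hypothetical.

```python
from typing import Dict, List, Set


def select_refinements(
    answer_entities: Set[str],
    candidates: Dict[str, Set[str]],
    max_refinements: int = 3,
) -> List[str]:
    """Greedily pick refinements that approximately partition answer_entities.

    Illustrative assumption only: each candidate is scored by the number of
    not-yet-covered answer entities it adds, minus the number of already
    covered entities it repeats (a penalty for redundancy).
    """
    chosen: List[str] = []
    covered: Set[str] = set()
    remaining = dict(candidates)  # copy so the caller's dict is not mutated

    def score(name: str) -> int:
        members = remaining[name] & answer_entities
        return len(members - covered) - len(members & covered)

    while remaining and len(chosen) < max_refinements:
        # Iterate names in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=score)
        if score(best) <= 0:
            break  # no candidate adds more new entities than it repeats
        chosen.append(best)
        covered |= remaining.pop(best) & answer_entities
    return chosen


if __name__ == "__main__":
    # Hypothetical toy data: entity ids and candidate refinement names.
    answer = {"a", "b", "c", "d", "e", "f"}
    cands = {
        "x": {"a", "b", "c"},
        "y": {"d", "e"},
        "z": {"c", "d"},  # overlaps both x and y, so it is redundant
        "w": {"f"},
    }
    print(select_refinements(answer, cands))
```

In this toy setup the overlapping candidate `z` is passed over in favor of disjoint candidates, which mirrors the stated goal of non-redundant, comprehensive refinement sets. A real system would need candidate refinements and entity memberships drawn from a knowledge base taxonomy, which this sketch does not model.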