Knowledge graph question answering (KGQA) facilitates information access by leveraging structured data without requiring formal query language expertise from the user. Instead, users can express their information needs simply by asking their questions in natural language (NL). Datasets used to train KGQA models that would provide such a service are expensive to construct, both in terms of expert and crowdsourced labor. Typically, crowdsourced labor is used to improve template-based pseudo-natural questions generated from formal queries. However, the resulting datasets often fall short of representing genuinely natural and fluent language. In the present work, we investigate ways to characterize and remedy these shortcomings. We create the IQN-KGQA test collection by sampling questions from existing KGQA datasets and evaluating them with regard to five different aspects of naturalness. Then, the questions are rewritten to improve their fluency. Finally, the performance of existing KGQA models is compared on the original and rewritten versions of the NL questions. We find that some KGQA systems fare worse when presented with more realistic formulations of NL questions. The IQN-KGQA test collection is a resource to help evaluate KGQA systems in a more realistic setting. The construction of this test collection also sheds light on the challenges of constructing large-scale KGQA datasets with genuinely NL questions.