We introduce the first system for the novel task of answering complex multi-sentence recommendation questions in the tourism domain. Our solution is a pipeline of two modules: question understanding and answering. For question understanding, we define an SQL-like query language that captures the semantic intent of a question; it supports operators such as subset, negation, preference, and similarity, which are common in recommendation questions. We train and compare traditional CRFs and bidirectional LSTM-based models for converting a question into its semantic representation. We extend these models to a semi-supervised setting that uses partially labeled sequences gathered through crowdsourcing. Our best model is a BiDiLSTM+CRF trained in this semi-supervised setting with hand-designed features and CCM (Chang et al., 2007) constraints. Finally, in an end-to-end QA system, our answering component converts the question representation into queries that are fired on the underlying knowledge sources. Our experiments on two different answer corpora demonstrate that our system can significantly outperform baselines, with up to 20 points higher accuracy and 17 points higher recall.
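To make the two-stage pipeline concrete, the following is a minimal sketch of how a question's semantic representation might be held in memory and then executed against a knowledge source. It is an illustration under stated assumptions, not the query language or answering component described above: the names `Op`, `Constraint`, `Query`, `answer`, and the toy `KB` are all hypothetical, the similarity operator is omitted, and the question-understanding step (the tagger that would produce the `Query`) is replaced by a hand-written query.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Op(Enum):
    # Hypothetical operator names, loosely following those in the abstract.
    SUBSET = "subset"          # hard constraint: the entity must have the attribute
    NEGATION = "negation"      # hard constraint: the entity must not have the attribute
    PREFERENCE = "preference"  # soft constraint: only affects ranking


@dataclass(frozen=True)
class Constraint:
    op: Op
    value: str


@dataclass
class Query:
    # Assumed shape of a semantic representation; the real language may differ.
    entity_type: str
    location: str
    constraints: List[Constraint] = field(default_factory=list)


def answer(query, knowledge_source):
    """Filter entities by hard constraints, then rank by satisfied preferences."""
    ranked = []
    for entity in knowledge_source:
        if entity["type"] != query.entity_type:
            continue
        if entity["location"] != query.location:
            continue
        attrs = entity["attributes"]
        hard_ok = True
        score = 0
        for c in query.constraints:
            present = c.value in attrs
            if c.op is Op.SUBSET and not present:
                hard_ok = False
            elif c.op is Op.NEGATION and present:
                hard_ok = False
            elif c.op is Op.PREFERENCE and present:
                score += 1
        if hard_ok:
            ranked.append((score, entity["name"]))
    # Highest preference score first; ties broken alphabetically by name.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in ranked]


# Toy knowledge source, invented for this example only.
KB = [
    {"name": "Hotel A", "type": "hotel", "location": "Paris",
     "attributes": {"budget", "near metro"}},
    {"name": "Hotel B", "type": "hotel", "location": "Paris",
     "attributes": {"budget", "noisy"}},
    {"name": "Hotel C", "type": "hotel", "location": "Paris",
     "attributes": {"budget"}},
    {"name": "Cafe D", "type": "restaurant", "location": "Paris",
     "attributes": {"budget"}},
]

# Hand-written stand-in for the output of question understanding, e.g. for
# "Looking for a budget hotel in Paris. Not a noisy one. Close to the metro would be nice."
query = Query(
    entity_type="hotel",
    location="Paris",
    constraints=[
        Constraint(Op.SUBSET, "budget"),
        Constraint(Op.NEGATION, "noisy"),
        Constraint(Op.PREFERENCE, "near metro"),
    ],
)

print(answer(query, KB))
```

In this sketch, hard constraints (subset, negation) prune candidates while preferences only reorder them, which is one plausible way to separate "must have" from "would be nice" in a recommendation question.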