We study the problem of stock-related question answering (StockQA): automatically generating answers to stock-related questions, much as professional stock analysts provide action recommendations on stocks upon users' requests. StockQA differs from previous QA tasks in two respects: (1) the answers in StockQA are natural language sentences (rather than entities or values), and due to the dynamic nature of StockQA, it is scarcely possible to obtain reasonable answers extractively from the training data; and (2) StockQA requires properly analyzing the relationship between the keywords in a QA pair and the numerical features of a stock. We propose to address the problem with a memory-augmented encoder-decoder architecture, and integrate different mechanisms for number understanding and generation, which is a critical component of StockQA. We build a large-scale Chinese dataset containing over 180K StockQA instances, on which various combinations of techniques are extensively studied and compared. Experimental results show that a hybrid word-character model with separate character components for number processing achieves the best performance.\footnote{The data is publicly available at \url{http://ai.tencent.com/ailab/nlp/dataset/}.}