Automatic question answering is an important yet challenging task in e-commerce, given the millions of questions users post about products they are interested in purchasing. Hence, there is great demand for automatic answer generation systems that provide quick responses using related information about the product. Three sources of knowledge are available for answering a user-posted query: reviews, duplicate or similar questions, and specifications. Using these sources effectively greatly helps in answering complex questions. However, exploiting them poses two main challenges: (i) the presence of irrelevant information and (ii) ambiguous sentiment in reviews and similar questions. In this work we propose a novel pipeline (MSQAP) that utilizes the rich information in these sources by performing relevancy prediction and ambiguity prediction separately before generating a response. Experimental results show that our relevancy prediction model (BERT-QA) outperforms all other variants, improving F1 score by 12.36% over the BERT-base baseline. Our generation model (T5-QA) outperforms the baselines on all content preservation metrics, such as BLEU and ROUGE, with an average improvement of 35.02% in ROUGE and 198.75% in BLEU over the highest-performing baseline (HSSC-q). Human evaluation shows that our full pipeline-based approach (MSQAP) improves overall accuracy by 30.7% over the generation model alone (T5-QA), and therefore provides more accurate answers. To the best of our knowledge, this is the first work in the e-commerce domain that automatically generates natural language answers by combining information from diverse sources such as specifications, similar questions, and reviews.
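The filter-then-generate flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Candidate` and `build_generator_input`, the 0.5 threshold, the prompt layout, and the choice to drop ambiguous candidates are hypothetical, and the toy scorers merely stand in for the BERT-QA relevancy model and the ambiguity predictor, whose actual behavior is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    source: str  # "review", "similar_question", or "specification"
    text: str


def build_generator_input(
    question: str,
    candidates: List[Candidate],
    relevance: Callable[[str, str], float],
    is_ambiguous: Callable[[str, str], bool],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> str:
    """Keep relevant, unambiguous candidates and join them with the question."""
    kept = []
    for cand in candidates:
        # Step 1: drop candidates the relevancy scorer rates below the threshold.
        if relevance(question, cand.text) < threshold:
            continue
        # Step 2: drop candidates flagged as ambiguous (an assumed policy).
        if is_ambiguous(question, cand.text):
            continue
        kept.append(cand)
    context = " ".join(f"[{c.source}] {c.text}" for c in kept)
    # The resulting string would be handed to a generation model such as T5-QA.
    return f"question: {question} context: {context}"


# Toy stand-ins (NOT the paper's models): word overlap for relevancy,
# a hedge-word check for ambiguity.
def toy_relevance(question: str, text: str) -> float:
    q = set(question.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def toy_is_ambiguous(question: str, text: str) -> bool:
    words = text.lower().split()
    return any(w in words for w in ("maybe", "sometimes"))
```

In a real system the two callables would wrap trained classifiers; passing them in as parameters keeps the filtering logic independent of any particular model.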