Training a data-driven model on a large dataset of unstructured dialogs is a crucial step in developing retrieval-based chatbot systems. This paper presents a Long Short-Term Memory (LSTM) based architecture that learns from unstructured multi-turn dialogs, and reports results on the task of selecting the best response from a given set of candidate responses. The model is trained on the Ubuntu Dialog Corpus Version 2. We show that our model achieves 0.8%, 1.0% and 0.3% higher accuracy on Recall@1, Recall@2 and Recall@5, respectively, than the benchmark model. We also report experiments with several similarity functions, model hyper-parameters and word embeddings on the proposed architecture.
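For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how Recall@k is commonly computed for response selection. It assumes that each test example has a list of candidate responses with exactly one correct response, and that the model assigns a score to each candidate. The function name `recall_at_k` and the toy scores are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def recall_at_k(
    score_lists: Sequence[Sequence[float]],
    true_indices: Sequence[int],
    k: int,
) -> float:
    """Fraction of examples whose correct response is ranked in the top k.

    Assumption: each example has exactly one correct candidate, identified
    by its index in the corresponding score list.
    """
    hits = 0
    for scores, true_idx in zip(score_lists, true_indices):
        # Rank candidate indices from highest to lowest model score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if true_idx in ranked[:k]:
            hits += 1
    return hits / len(score_lists)


# Toy data (hypothetical): three examples with three candidates each.
scores = [
    [0.9, 0.1, 0.3],  # correct candidate 0 is ranked 1st
    [0.2, 0.8, 0.5],  # correct candidate 2 is ranked 2nd
    [0.4, 0.6, 0.7],  # correct candidate 0 is ranked 3rd
]
truth = [0, 2, 0]

for k in (1, 2, 3):
    print(f"Recall@{k} = {recall_at_k(scores, truth, k):.3f}")
```

Under this definition, Recall@k can only stay the same or increase as k grows, since a correct response found in the top k is also in the top k+1.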