Current end-to-end retrieval-based dialogue systems are mainly built on Recurrent Neural Networks or Transformers with attention mechanisms. Although promising results have been achieved, these models often suffer from slow inference or a huge number of parameters. In this paper, we propose a novel lightweight fully convolutional architecture, called DialogConv, for response selection. DialogConv is built exclusively on convolution to extract matching features between context and response. Dialogues are modeled in 3D, where DialogConv performs convolution operations on the embedding view, word view, and utterance view to capture richer semantic information from multiple contextual views. On four benchmark datasets, compared with state-of-the-art baselines, DialogConv is on average about 8.5x smaller in size, and 79.39x and 10.64x faster on CPU and GPU devices, respectively. At the same time, DialogConv achieves competitive effectiveness in response selection.
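The multi-view idea above can be illustrated with a minimal PyTorch sketch: a dialogue is a 3D tensor (utterances x words x embedding dims), and each "view" rotates which axis serves as the convolution's channel dimension. All layer sizes and the pooling/fusion step here are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn


class MultiViewConv(nn.Module):
    """Sketch of convolving a 3D dialogue tensor from three views.

    Assumed input shape: (batch, U utterances, W words, D embedding dims).
    Each view uses a different axis as Conv2d channels, so the kernel
    slides over the remaining two axes. Sizes are illustrative only.
    """

    def __init__(self, U=10, W=20, D=32, out_ch=16, k=3):
        super().__init__()
        p = k // 2  # same-padding so spatial dims are preserved
        self.embed_view = nn.Conv2d(D, out_ch, k, padding=p)  # slides over (U, W)
        self.word_view = nn.Conv2d(W, out_ch, k, padding=p)   # slides over (U, D)
        self.utt_view = nn.Conv2d(U, out_ch, k, padding=p)    # slides over (W, D)

    def forward(self, x):  # x: (B, U, W, D)
        e = self.embed_view(x.permute(0, 3, 1, 2))  # channels = embedding axis
        w = self.word_view(x.permute(0, 2, 1, 3))   # channels = word axis
        u = self.utt_view(x)                        # channels = utterance axis
        # Global max-pool each view's feature map, then concatenate into
        # one matching-feature vector (an assumed fusion step).
        return torch.cat([f.amax(dim=(2, 3)) for f in (e, w, u)], dim=1)


model = MultiViewConv(U=10, W=20, D=32, out_ch=16)
features = model(torch.randn(2, 10, 20, 32))  # -> shape (2, 48)
```

Rotating the channel axis is what lets plain 2D convolutions capture utterance-level, word-level, and embedding-level interactions without any recurrence or attention.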