Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process, often resulting in uninteresting responses. Attempts to boost informativeness alone come at the expense of factual accuracy, as attested by pretrained language models' propensity to "hallucinate" facts. While this may be mitigated by access to background knowledge, there is scant guarantee of relevance and informativeness in generated responses. We propose a framework that we call controllable grounded response generation (CGRG), in which lexical control phrases are either provided by a user or automatically extracted by a control phrase predictor from dialogue context and grounding knowledge. Quantitative and qualitative results show that, using this framework, a transformer-based model with a novel inductive attention mechanism, trained on a conversation-like Reddit dataset, outperforms strong generation baselines.
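To make the framework concrete, the sketch below illustrates one way the inductive attention idea could be realized: the input is laid out as [dialogue context | grounding | control phrases], and attention links between control phrases and unrelated grounding tokens are pruned so that generation is steered only through relevant evidence. This is a minimal sketch assuming a PyTorch, GPT-2-style decoder; the function name `build_inductive_mask` and the `links` structure are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of an inductive attention mask over a
# [context | grounding | control] token layout (hypothetical layout;
# see the CGRG paper for the exact formulation).
import torch

def build_inductive_mask(n_ctx, n_ground, n_ctrl, links):
    """Build a boolean attention mask of shape (n, n).

    `links` maps a grounding token position (relative to the grounding
    segment) to the control-phrase token positions (relative to the
    control segment) it supports. All other grounding<->control
    attention links are pruned.
    """
    n = n_ctx + n_ground + n_ctrl
    # Causal base mask: each position attends only to earlier positions.
    mask = torch.tril(torch.ones(n, n)).bool()
    g0, c0 = n_ctx, n_ctx + n_ground
    # Cut attention from control tokens back to all grounding tokens...
    mask[c0:, g0:c0] = False
    # ...then restore only the links between a control phrase and the
    # grounding tokens that actually contain or support it.
    for g_pos, c_positions in links.items():
        for c_pos in c_positions:
            mask[c0 + c_pos, g0 + g_pos] = True
    return mask

# Example: 4 context tokens, 3 grounding tokens, 2 control tokens;
# grounding token 1 supports control token 0.
mask = build_inductive_mask(4, 3, 2, {1: [0]})
print(mask.shape)  # torch.Size([9, 9])
```

In practice such a mask would be passed to the decoder's self-attention layers in place of the plain causal mask, so the pruned links contribute nothing to the attention weights.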