While the interpretation of classification models has been studied extensively, explaining sequence generation models is an equally important problem that has received little attention. In this work, we study model-agnostic explanations of a representative text generation task -- dialogue response generation. Dialogue response generation is challenging because of its open-ended sentences and multiple acceptable responses. To gain insights into the reasoning process of a generation model, we propose a new method, local explanation of response generation (LERG), which regards explanations as the mutual interactions of segments in the input and output sentences. LERG views sequence prediction as uncertainty estimation of a human response, and then creates explanations by perturbing the input and calculating the resulting change in certainty over the human response. We show that LERG adheres to desired properties of explanations for text generation, including unbiased approximation, consistency, and cause identification. Empirically, our method consistently outperforms other widely used methods on the proposed automatic and human evaluation metrics for this new task by 4.4-12.8%. Our analysis demonstrates that LERG can extract both explicit and implicit relations between input and output segments.
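The core idea of perturbing the input and measuring the change in certainty over the human response can be sketched as follows. This is a minimal illustration, not the authors' implementation: the scoring function `toy_log_prob` is a hypothetical stand-in for a real model's log-likelihood `log P(response | input)`, and perturbation is simplified to dropping one input token at a time.

```python
def toy_log_prob(input_tokens, response_tokens):
    """Hypothetical stand-in for a model's log P(response | input).

    A real system would score the human response with a trained
    generation model; here we simply reward word overlap.
    """
    overlap = sum(1 for t in response_tokens if t in input_tokens)
    return overlap - 0.1 * len(response_tokens)

def explain(input_tokens, response_tokens, log_prob=toy_log_prob):
    """Score each input segment by the certainty drop over the human
    response when that segment is perturbed (here: removed)."""
    base = log_prob(input_tokens, response_tokens)
    scores = {}
    for i, tok in enumerate(input_tokens):
        perturbed = input_tokens[:i] + input_tokens[i + 1:]  # drop segment i
        scores[tok] = base - log_prob(perturbed, response_tokens)
    return scores

# Input segments that the response depends on get higher scores.
scores = explain(["do", "you", "like", "coffee"], ["i", "like", "coffee"])
```

Under this toy scorer, removing "coffee" or "like" lowers the certainty of the human response, so those segments receive positive attribution, while "do" and "you" score near zero; a real instantiation would replace the scorer with a generation model and aggregate scores per input-output segment pair.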