Large sequence-to-sequence models for tasks such as Neural Machine Translation (NMT) are usually trained over hundreds of millions of samples. However, training is only the beginning of a model's life-cycle. Real-world deployments require further behavioral adaptation as new requirements emerge or shortcomings become known. Typically, behavior-deletion requests are addressed by retraining the model, whereas behavior-addition requests are addressed by finetuning; both procedures are instances of data-based model intervention. In this work, we present a preliminary study investigating rank-one editing as a direct intervention method for behavior-deletion requests in encoder-decoder transformer models. We propose four editing tasks for NMT and show that the proposed editing algorithm achieves high efficacy while requiring only a single positive example to fix an erroneous (negative) model behavior.
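To make the notion of rank-one editing concrete, the sketch below shows a generic closed-form rank-one update of a single weight matrix, in the spirit of ROME-style model editing: given a key vector extracted from the positive example and the value vector that yields the corrected behavior, the matrix is modified by a single outer product so that the key now maps to the desired value. This is only a minimal illustration under assumed names (`rank_one_edit`, `k_star`, `v_star`, `C`); the exact formulation used for the encoder-decoder NMT setting in this work may differ.

```python
import torch

def rank_one_edit(W: torch.Tensor, k_star: torch.Tensor, v_star: torch.Tensor,
                  C: torch.Tensor) -> torch.Tensor:
    """Illustrative rank-one update so that the edited layer maps k_star to v_star.

    W      : (d_out, d_in) weight matrix of the targeted projection
    k_star : (d_in,)  key vector extracted from the single positive example
    v_star : (d_out,) value vector producing the desired (corrected) output
    C      : (d_in, d_in) uncentered covariance of keys over generic data,
             used to keep the update minimally disruptive to other inputs
    """
    # Residual between the desired value and what the current weights produce.
    residual = v_star - W @ k_star                       # (d_out,)
    # Update direction in key space, whitened by the key covariance.
    u = torch.linalg.solve(C, k_star)                    # (d_in,)
    # Closed-form rank-one correction; guarantees W_new @ k_star == v_star.
    W_new = W + torch.outer(residual / (u @ k_star), u)  # (d_out, d_in)
    return W_new
```

By construction, the edited matrix reproduces the corrected output exactly on the key of the positive example, while the covariance-whitened direction keeps the change small for typical inputs; a retraining- or finetuning-based fix, by contrast, would require curating and processing many samples.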