Augmenting the base neural model with a token-level symbolic datastore is a novel generation paradigm that has achieved promising results in machine translation (MT). In this paper, we introduce kNN-BOX, a unified framework that enables quick development and interactive analysis for this paradigm. kNN-BOX decomposes the datastore-augmentation approach into three modules: datastore, retriever and combiner, thereby unifying diverse kNN generation methods within a single framework. Currently, kNN-BOX provides implementations of seven popular kNN-MT variants, covering research from performance enhancement to efficiency optimization. It is easy for users to reproduce these existing works or customize their own models. In addition, users can interact with their kNN generation systems through kNN-BOX to better understand the underlying inference process in a visualized way. In the experiment section, we apply kNN-BOX to machine translation and three other seq2seq generation tasks, namely text simplification, paraphrase generation and question generation. Experimental results show that augmenting the base neural model with kNN-BOX leads to a large performance improvement on all these tasks. The code and documentation of kNN-BOX are available at https://github.com/NJUNLP/knn-box.
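To make the datastore/retriever/combiner decomposition concrete, the following is a minimal PyTorch sketch of vanilla kNN-MT-style interpolation. The class names, parameters (`lambda_`, `temperature`) and method signatures are illustrative assumptions for this sketch, not the actual kNN-BOX API.

```python
# Minimal sketch (not the actual kNN-BOX API) of the datastore/retriever/combiner
# decomposition behind kNN-augmented generation, in plain PyTorch.
import torch
import torch.nn.functional as F


class Datastore:
    """Token-level key-value memory: decoder hidden states -> target token ids."""

    def __init__(self, keys: torch.Tensor, values: torch.Tensor):
        self.keys = keys      # (N, hidden_dim) hidden states from the base model
        self.values = values  # (N,) target token ids observed at those states


class Retriever:
    """Finds the k nearest datastore entries for a query hidden state."""

    def __init__(self, datastore: Datastore, k: int = 8):
        self.datastore = datastore
        self.k = k

    def retrieve(self, query: torch.Tensor):
        # Squared L2 distance between the query and every stored key.
        dists = torch.cdist(query.unsqueeze(0), self.datastore.keys).squeeze(0) ** 2
        knn_dists, idx = dists.topk(self.k, largest=False)
        return knn_dists, self.datastore.values[idx]


class Combiner:
    """Interpolates the kNN distribution with the base model distribution."""

    def __init__(self, lambda_: float = 0.5, temperature: float = 10.0):
        self.lambda_ = lambda_
        self.temperature = temperature

    def combine(self, knn_dists, knn_values, model_probs):
        # Turn the retrieved neighbors into a distribution over the vocabulary.
        weights = F.softmax(-knn_dists / self.temperature, dim=-1)
        knn_probs = torch.zeros_like(model_probs)
        knn_probs.scatter_add_(0, knn_values, weights)
        # Fixed-weight interpolation with the base model, as in vanilla kNN-MT.
        return self.lambda_ * knn_probs + (1 - self.lambda_) * model_probs
```

Under this decomposition, the kNN-MT variants differ mainly in how each module is instantiated, e.g. how the datastore is built or pruned, how neighbors are retrieved, and whether the interpolation weight is fixed or predicted.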