Natural language generation technology has recently seen remarkable progress with large-scale training, and many natural language applications are now built upon a wide range of generation models. Combining diverse models may lead to further progress, but conventional ensembling (e.g., shallow fusion) requires that they share a vocabulary and tokenization scheme. We introduce Twist decoding, a simple and general inference algorithm that generates text while benefiting from diverse models. Our method does not assume that the vocabulary, tokenization, or even generation order is shared. Our extensive evaluations on machine translation and scientific paper summarization demonstrate that Twist decoding substantially outperforms each model decoded in isolation over various scenarios, including cases where both domain-specific and general-purpose models are available. Twist decoding also consistently outperforms the popular reranking heuristic where output candidates from one model are rescored by another. We hope that our work will encourage researchers and practitioners to examine generation models collectively, not just independently, and to seek out models with complementary strengths to the currently available models.
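To make the reranking baseline mentioned above concrete, here is a minimal sketch (not the paper's Twist decoding algorithm) of rescoring: model A proposes n-best candidates and model B ranks them by log-likelihood of the plain text, so no shared vocabulary or tokenizer is required. The model names and hyperparameters below are placeholders, not ones used in the paper.

```python
# A hedged sketch of n-best reranking: model A generates candidates,
# model B rescores them, and the highest-scoring candidate is returned.
# "model-a" and "model-b" are hypothetical checkpoint names.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def rerank(source: str, name_a: str = "model-a", name_b: str = "model-b",
           n_best: int = 5) -> str:
    tok_a = AutoTokenizer.from_pretrained(name_a)
    gen_a = AutoModelForSeq2SeqLM.from_pretrained(name_a)
    tok_b = AutoTokenizer.from_pretrained(name_b)
    gen_b = AutoModelForSeq2SeqLM.from_pretrained(name_b)

    # Model A proposes n-best candidates via beam search.
    inputs_a = tok_a(source, return_tensors="pt")
    beams = gen_a.generate(**inputs_a, num_beams=n_best,
                           num_return_sequences=n_best, max_new_tokens=128)
    candidates = tok_a.batch_decode(beams, skip_special_tokens=True)

    # Model B rescores each candidate by its per-token log-likelihood.
    # Scoring operates on decoded text, so the two models may use
    # entirely different vocabularies and tokenizations.
    def score(text: str) -> float:
        enc = tok_b(source, return_tensors="pt")
        labels = tok_b(text, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = gen_b(**enc, labels=labels).loss  # mean NLL per target token
        return -loss.item()

    return max(candidates, key=score)
```

Note that this baseline can only select among model A's candidates; the abstract's point is that Twist decoding instead lets diverse models guide generation itself, and empirically outperforms this rescoring heuristic.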