Natural language modeling with limited training data is a challenging problem, and many algorithms address it with large-scale pretrained language models (PLMs) owing to their strong generalization ability. Among them, additive learning, which places a task-specific adapter on top of a fixed large-scale PLM, has been widely used in the few-shot setting. However, the added adapter can still easily disregard the knowledge of the PLM, especially in few-shot natural language generation (NLG), since the entire sequence is usually generated by the newly trained adapter alone. Therefore, in this work, we develop a novel additive learning algorithm based on reinforcement learning (RL) that selectively outputs language tokens from either the task-general PLM or the task-specific adapter during both training and inference. This token-level selection over the two generators allows the adapter to handle only the task-relevant parts of sequence generation, making it more robust to overfitting and more stable in RL training. In addition, to obtain an adapter that complements the PLM on each few-shot task, we employ a separate selection module that is trained simultaneously using RL. Experimental results on various few-shot NLG tasks, including question answering, data-to-text generation, and text summarization, demonstrate that the proposed selective token generation significantly outperforms previous additive learning algorithms based on PLMs.
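To make the core idea concrete, the following is a minimal sketch of token-level selection between a frozen task-general generator and a trainable task-specific adapter, with the selection action sampled so that its log-probability can serve as a policy-gradient signal for RL. All module names, shapes, and the toy linear "generators" are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn


class SelectiveTokenGenerator(nn.Module):
    """Illustrative sketch: at each decoding step, a selector decides whether
    the next token comes from the frozen PLM head or from the adapter head."""

    def __init__(self, plm_head, adapter_head, hidden_dim):
        super().__init__()
        self.plm_head = plm_head          # frozen, task-general generator
        self.adapter_head = adapter_head  # trainable, task-specific generator
        # Selection module: scores the two generators given the decoder state.
        self.selector = nn.Linear(hidden_dim, 2)

    def step(self, hidden_state):
        # hidden_state: (batch, hidden_dim) decoder state at the current step.
        plm_logits = self.plm_head(hidden_state)          # (batch, vocab)
        adapter_logits = self.adapter_head(hidden_state)  # (batch, vocab)

        # Sample a binary selection action (0 = PLM, 1 = adapter); the log-prob
        # of this action is what an RL objective would reinforce.
        select_dist = torch.distributions.Categorical(
            logits=self.selector(hidden_state))
        action = select_dist.sample()                     # (batch,)

        logits = torch.where(action.unsqueeze(-1).bool(),
                             adapter_logits, plm_logits)
        token = torch.distributions.Categorical(logits=logits).sample()
        return token, select_dist.log_prob(action)


if __name__ == "__main__":
    # Toy usage with stand-in linear heads over a 100-token vocabulary.
    hidden_dim, vocab_size, batch = 16, 100, 4
    plm_head = nn.Linear(hidden_dim, vocab_size)
    for p in plm_head.parameters():
        p.requires_grad_(False)  # the PLM stays fixed in additive learning
    adapter_head = nn.Linear(hidden_dim, vocab_size)

    model = SelectiveTokenGenerator(plm_head, adapter_head, hidden_dim)
    tokens, select_logprob = model.step(torch.randn(batch, hidden_dim))
    print(tokens.shape, select_logprob.shape)
```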