Recent works have shown that attaching prompts to the input is effective at conditioning Language Models (LMs) to perform specific tasks. However, prompts are always included in the input text during inference, incurring substantial computational and memory overhead. Moreover, there is currently no straightforward way to utilize prompts longer than the maximum input length of an LM without incurring additional costs during inference. We propose Prompt Injection (PI), a novel formulation that injects the prompt into the parameters of an LM as an efficient alternative to attaching fixed prompts to the input. We show that in scenarios with long fixed prompts, PI can be up to 280 times more efficient in terms of total FLOPs than previous approaches. We further explore methodologies for PI and show promising results in persona-dependent conversation, semantic parsing, and zero-shot learning with task instructions. Through these explorations, we show that PI is a promising direction for conditioning language models, especially in scenarios with long and fixed prompts.
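To see why a long fixed prompt is expensive at inference, consider a toy back-of-the-envelope calculation (this is an illustrative sketch, not the paper's FLOPs accounting): self-attention cost grows quadratically with sequence length, so a prompt prepended to every input dominates the per-query cost, while a prompt injected into the parameters adds nothing at inference. The sequence lengths and model dimensions below are hypothetical.

```python
# Illustrative sketch (not the paper's method or measurements): estimate the
# self-attention FLOPs of a transformer forward pass with and without a long
# fixed prompt prepended to the input. Constants and non-attention terms are
# omitted, since we only care about the ratio.

def attention_flops(seq_len: int, d_model: int = 1024, n_layers: int = 24) -> int:
    """Rough FLOPs for the QK^T and attention-times-V products across layers:
    roughly 2 * (2 * seq_len^2 * d_model) per layer."""
    return 2 * 2 * seq_len ** 2 * d_model * n_layers

input_len, prompt_len = 128, 2048  # hypothetical lengths

with_prompt = attention_flops(input_len + prompt_len)   # prompt attached to input
without_prompt = attention_flops(input_len)             # prompt injected into weights

print(f"attention FLOPs ratio: {with_prompt / without_prompt:.1f}x")
# → attention FLOPs ratio: 289.0x
```

Under these (hypothetical) lengths the ratio is (2176/128)^2 = 289, the same order of magnitude as the up-to-280x savings reported above; the exact figure depends on the model and on the non-attention FLOPs the sketch ignores.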