Recent NLP models have shown a remarkable ability to generalise `zero-shot' to new tasks, using only an instruction as guidance. However, these approaches usually repeat the instruction with every input, requiring costly reprocessing of lengthy instructions for every inference example. To alleviate this, we introduce Hypernetworks for INstruction Tuning (HINT), which use a pretrained text encoder to convert task instructions and examples into parameter-efficient modules that are inserted into an underlying model, eliminating the need to include instructions in the model input. Compared to prior approaches that concatenate instructions with every input instance, we find that HINT models are significantly more compute-efficient and consistently outperform these approaches for a given inference budget.
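To make the mechanism concrete, the following is a minimal PyTorch sketch of the general idea only, not the paper's actual architecture: the `Hypernetwork`, `AdaptedLayer`, and toy encoder below are hypothetical stand-ins, and the adapter design is one illustrative choice among many.

```python
# Sketch of the HINT idea (illustrative, not the authors' implementation):
# encode the instruction ONCE, map the encoding to adapter weights with a
# hypernetwork, insert the adapter into the underlying model, then run
# inference on inputs that no longer contain the instruction.
import torch
import torch.nn as nn

D_MODEL = 64      # hidden size of the (toy) underlying model
D_ADAPTER = 8     # bottleneck size of the generated adapter


class Hypernetwork(nn.Module):
    """Maps a pooled instruction encoding to adapter weights."""

    def __init__(self, d_enc):
        super().__init__()
        n_params = 2 * D_MODEL * D_ADAPTER  # down- and up-projection
        self.to_params = nn.Linear(d_enc, n_params)

    def forward(self, instr_enc):
        flat = self.to_params(instr_enc)
        w_down, w_up = flat.split(D_MODEL * D_ADAPTER)
        return (w_down.view(D_ADAPTER, D_MODEL),
                w_up.view(D_MODEL, D_ADAPTER))


class AdaptedLayer(nn.Module):
    """One layer of the underlying model with a slot for a generated adapter."""

    def __init__(self):
        super().__init__()
        self.base = nn.Linear(D_MODEL, D_MODEL)
        self.adapter = None  # filled in by the hypernetwork

    def forward(self, x):
        h = torch.relu(self.base(x))
        if self.adapter is not None:
            # residual bottleneck adapter: h + up(relu(down(h)))
            w_down, w_up = self.adapter
            h = h + torch.relu(h @ w_down.T) @ w_up.T
        return h


# Usage: the instruction is encoded and converted to module weights once.
encoder = nn.EmbeddingBag(1000, 32)  # stand-in for a pretrained text encoder
hypernet = Hypernetwork(d_enc=32)
layer = AdaptedLayer()

instruction_ids = torch.tensor([[5, 17, 42, 8]])  # toy token ids
instr_enc = encoder(instruction_ids).squeeze(0)   # pooled encoding, shape (32,)
layer.adapter = hypernet(instr_enc)               # insert the generated module

# Later inputs omit the instruction entirely; it is never reprocessed.
for _ in range(3):
    x = torch.randn(4, D_MODEL)  # a batch of task inputs
    y = layer(x)
    print(y.shape)
```

The design choice the sketch highlights is the compute trade-off in the abstract: the instruction's encoding cost is paid once per task rather than once per inference example, since the adapter weights amortise it across all subsequent inputs.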