Task semantics can be expressed by a set of input-to-output examples or by a piece of textual instruction. Conventional machine learning approaches for natural language processing (NLP) rely mainly on the availability of large-scale sets of task-specific examples. Two issues arise: first, collecting task-specific labeled examples is impractical when a task is too complicated or costly to annotate, or when the system must handle a new task immediately; second, this is not user-friendly, since end-users are probably more willing to provide a task description than a set of examples before using the system. Therefore, the community is showing increasing interest in a new supervision-seeking paradigm for NLP: learning from task instructions. Despite impressive progress, the community still struggles with some common issues. This survey summarizes current research on instruction learning by answering the following questions: (i) what is a task instruction, and what instruction types exist? (ii) how can instructions be modeled? (iii) what factors influence and explain the performance of instructions? (iv) what challenges remain in instruction learning? To our knowledge, this is the first comprehensive survey about textual instructions.