The instruction learning paradigm -- where a model learns to perform new tasks from task descriptions alone -- has become popular in general-purpose model research. The capabilities of large transformer models as instruction learners, however, remain poorly understood. We use a controlled synthetic environment to characterize such capabilities. Specifically, we use the task of deciding whether a given string matches a regular expression (viewed as an instruction) to identify properties of tasks, instructions, and instances that make instruction learning challenging. For instance, we find that our model, a fine-tuned T5-based text2text transformer, struggles with large regular languages, suggesting that less precise instructions are challenging for models. Additionally, instruction executions that require tracking longer contexts of prior steps are also more difficult. We use our findings to systematically construct a challenging instruction learning dataset, which we call Hard RegSet. After fine-tuning on Hard RegSet, our large transformer learns to correctly interpret only 65.6% of test instructions (with at least 90% accuracy), and 11%-24% of the instructions in out-of-distribution generalization settings. We propose Hard RegSet both as a challenging instruction learning task and as a controlled environment for studying instruction learning.
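To make the task concrete, the following is a minimal sketch of how a regular expression and a candidate string could be paired into one text2text example, and how a per-instruction success check might be computed. The prompt layout (`instruction: ... instance: ...`), the helper names `make_example` and `instruction_solved`, and the use of Python's `re` syntax are all assumptions made for illustration; the abstract does not specify the dataset's actual encoding or regex dialect. The 0.9 threshold reflects one reading of "with at least 90% accuracy", namely accuracy over the instances of a single instruction.

```python
import re
from typing import Callable, Iterable, Tuple


def make_example(regex: str, string: str) -> Tuple[str, str]:
    """Build one (input, target) text pair; the prompt format is hypothetical."""
    # The instance is positive only if the whole string matches the regex.
    label = "True" if re.fullmatch(regex, string) else "False"
    return f"instruction: {regex} instance: {string}", label


def instruction_solved(
    regex: str,
    strings: Iterable[str],
    predict: Callable[[str], str],
    threshold: float = 0.9,
) -> bool:
    """Return True if `predict` reaches `threshold` accuracy on this regex."""
    pairs = [make_example(regex, s) for s in strings]
    if not pairs:
        raise ValueError("need at least one instance per instruction")
    correct = sum(predict(src) == tgt for src, tgt in pairs)
    return correct / len(pairs) >= threshold
```

Here `predict` stands in for the fine-tuned model: any function mapping the input text to `"True"` or `"False"`. A trivial baseline such as `lambda src: "True"` passes the check only for instructions whose sampled instances are almost all positive.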