We investigate the dynamics of increasing the number of model parameters versus the number of labeled examples across a wide variety of tasks. Our exploration reveals that while scaling parameters consistently yields performance improvements, the contribution of additional examples depends heavily on the task's format. Specifically, in open question answering tasks, enlarging the training set does not improve performance. In contrast, classification, extractive question answering, and multiple choice tasks benefit so much from additional examples that collecting a few hundred of them is often "worth" billions of parameters. We hypothesize that unlike open question answering, which involves recalling specific information, solving strategies for tasks with a more restricted output space transfer across examples, and can therefore be learned with small amounts of labeled data.