We propose a new paradigm for zero-shot learners that is format agnostic, i.e., it is compatible with any format and applicable to a range of language tasks, such as text classification, commonsense reasoning, coreference resolution, and sentiment analysis. Zero-shot learning aims to train a model on a given task such that it can address new learning tasks without any additional training. Our approach converts zero-shot learning into multiple-choice tasks, avoiding problems in commonly used large-scale generative models such as FLAN. It not only adds generalization ability to models but also significantly reduces the number of parameters. Our method also offers the merits of efficient training and deployment. Our approach shows state-of-the-art performance on several benchmarks and produces satisfactory results on tasks such as natural language inference and text classification. Our model achieves this success with only 235M parameters, which is substantially smaller than state-of-the-art models with billions of parameters. The code and pre-trained models are available at https://github.com/IDEA-CCNL/Fengshenbang-LM.