Task-agnostic generative pretraining (GPT) has recently proved promising for zero- and few-shot learning, gradually diverting attention from the expensive supervised learning paradigm. Although the community is accumulating knowledge about the capabilities of English-language autoregressive models such as GPT-3 under this generative approach, scholarship about these models remains acutely Anglocentric. Consequently, the community currently has serious gaps in its understanding of this class of models, their potential, and their societal impacts in diverse settings, linguistic traditions, and cultures. To alleviate this issue for Arabic, a collection of diverse languages and language varieties spoken by more than $400$ million people, we introduce JASMINE, a suite of powerful Arabic autoregressive Transformer language models ranging in size from 300 million to 13 billion parameters. We pretrain our new models on large amounts of diverse data (400GB of text) from different Arabic varieties and domains. We evaluate JASMINE extensively in both intrinsic and extrinsic settings, using a comprehensive benchmark for zero- and few-shot learning across a wide range of NLP tasks. We also carefully develop and release a novel benchmark for both automated and human evaluation of Arabic autoregressive models, focused on investigating potential social biases, harms, and toxicity in these models. We aim to responsibly release our models to interested researchers, along with code for experimenting with them.