Despite the successes of pretrained language models, there are still few high-quality, general-purpose question-answering (QA) systems that are freely available. In response, we present Macaw, a versatile, generative QA system that we are making available to the community. Macaw is built on UnifiedQA, which is itself built on T5, and exhibits strong zero-shot performance on a wide variety of topics. In particular, it outperforms GPT-3 by over 10% (absolute) on Challenge300, a suite of 300 challenge questions, despite being an order of magnitude smaller (11 billion vs. 175 billion parameters). In addition, Macaw allows different permutations ("angles") of its inputs and outputs to be used. For example, Macaw can take a question and produce an answer; take an answer and produce a question; or take an answer and a question and produce multiple-choice options. We describe the system and illustrate a variety of question types where it produces surprisingly good answers, well outside the training setup. We also identify question classes where it still appears to struggle, offering insights into the limitations of pretrained language models. Macaw is freely available at https://github.com/allenai/macaw, and we hope that it proves useful to the community.
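As a rough illustration of the "angles" idea, the sketch below builds text prompts for the three input/output permutations mentioned above. The slot names (`question`, `answer`, `mcoptions`), the `$slot$` string layout, and the helper `build_angle_prompt` are assumptions made for illustration only; the abstract does not specify the format, so consult the repository for the one Macaw actually expects.

```python
# Hypothetical slot names; the abstract only mentions questions, answers,
# and multiple-choice options, not how they are encoded.
SLOTS = ("question", "answer", "mcoptions")


def build_angle_prompt(outputs, inputs):
    """Build a prompt listing the requested output slots, then the filled input slots.

    outputs: list of slot names the model should generate.
    inputs:  dict mapping slot names to the text supplied for them.
    """
    unknown = [s for s in list(outputs) + list(inputs) if s not in SLOTS]
    if unknown:
        raise ValueError(f"unknown slots: {unknown}")
    overlap = set(outputs) & set(inputs)
    if overlap:
        raise ValueError(f"slots cannot be both input and output: {sorted(overlap)}")
    # Requested outputs appear as bare slot markers; inputs carry their text.
    parts = [f"${name}$" for name in outputs]
    parts += [f"${name}$ = {text}" for name, text in inputs.items()]
    return " ; ".join(parts)


# Angle 1: question -> answer
q_to_a = build_angle_prompt(["answer"], {"question": "What gas do plants absorb?"})

# Angle 2: answer -> question
a_to_q = build_angle_prompt(["question"], {"answer": "carbon dioxide"})

# Angle 3: question + answer -> multiple-choice options
qa_to_m = build_angle_prompt(
    ["mcoptions"],
    {"question": "What gas do plants absorb?", "answer": "carbon dioxide"},
)

print(q_to_a)
print(a_to_q)
print(qa_to_m)
```

Each resulting string would then be passed to a sequence-to-sequence model, which generates text for the requested slots. Treating an angle as a choice of which slots are filled and which are requested keeps one model usable for all permutations.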