The topic of Climate Change (CC) has received limited attention in NLP despite its real-world urgency. Activists and policy-makers need NLP tools to effectively process the vast and rapidly growing body of textual data produced on CC. The utility of such tools, however, depends primarily on whether current state-of-the-art models can generalize across the various tasks in the CC domain. To address this gap, we introduce Climate Change Benchmark (ClimaBench), a benchmark collection of existing, disparate datasets for systematically evaluating model performance across a diverse set of CC NLU tasks. Further, we enhance the benchmark by releasing two large-scale labelled text classification and question-answering datasets curated from publicly available environmental disclosures. Lastly, we analyze several generic and CC-oriented models to determine whether fine-tuning on domain text offers any improvement across these tasks. We hope this work provides a standard assessment tool for research on CC text data.