Neural Transfer Learning with Transformers for Social Science Text Analysis

In recent years, the prediction performance of natural language processing models on text-based supervised learning tasks has increased substantially. Deep learning models that are based on the Transformer architecture (Vaswani et al., 2017) and used in a transfer learning setting have contributed especially to this development. Because Transformer-based models for transfer learning have the potential to achieve higher prediction accuracy with relatively few training instances, they are likely to benefit social scientists who seek text-based measures that are as accurate as possible but have only limited resources for annotating training data. To enable social scientists to leverage these potential benefits in their research, this paper explains how these methods work, why they might be advantageous, and what their limitations are. Additionally, three Transformer-based models for transfer learning, BERT (Devlin et al., 2019), RoBERTa (Liu et al., 2019), and the Longformer (Beltagy et al., 2020), are compared to conventional machine learning algorithms on three social science applications. Across all evaluated tasks, textual styles, and training data set sizes, transfer learning with Transformer-based models consistently outperforms the conventional models, demonstrating the potential benefits these models can bring to text-based social science research.
