社会科学文本分析变革者神经转移学习入门 (Introduction to Neural Transfer Learning with Transformers for Social Science Text Analysis) - 专知论文

会员服务 ·

0

Learning · 迁移学习 · Analysis · 训练数据 · MoDELS ·

2022 年 8 月 31 日

Introduction to Neural Transfer Learning with Transformers for Social Science Text Analysis

翻译：社会科学文本分析变革者神经转移学习入门

Sandra Wankmüller

from arxiv, 80 pages, 12 figures; changed the title; more focused presentation of contents; moved contents to the appendix; created a new Figure 9; discussion of additional aspects (zero-shot learning, cross-lingual learning, interpretability, foundation models); removed old Figures 4 and 5; made non-essential changes to Figures 1, 2, 4, 6, 7, 8 and 10; changed notation. The original results are unchanged

Transformer-based models for transfer learning have the potential to achieve high prediction accuracies on text-based supervised learning tasks with relatively few training data instances. These models are thus likely to benefit social scientists that seek to have as accurate as possible text-based measures but only have limited resources for annotating training data. To enable social scientists to leverage these potential benefits for their research, this paper explains how these methods work, why they might be advantageous, and what their limitations are. Additionally, three Transformer-based models for transfer learning, BERT (Devlin et al. 2019), RoBERTa (Liu et al. 2019), and the Longformer (Beltagy et al. 2020), are compared to conventional machine learning algorithms on three applications. Across all evaluated tasks, textual styles, and training data set sizes, the conventional models are consistently outperformed by transfer learning with Transformers, thereby demonstrating the benefits these models can bring to text-based social science research.

翻译：以变换器为基础的转让学习模型有可能在基于文本的监督下学习任务上实现高预测值,而培训数据实例相对较少,因此这些模型有可能使社会科学家受益,他们寻求尽可能精确的基于文本的措施,但用于说明培训数据的资源有限。为了使社会科学家能够利用这些潜在的好处来进行研究,本文件解释了这些方法如何发挥作用,为什么它们可能有利,以及它们的局限性。此外,三个基于变换器的转让学习模型,即BERT(Devlin等人,2019年)、RoBERTA(Liu等人,2019年)和Longexu(Beltaty等人,2020年)都与三种应用的传统机器学习算法进行了比较。在所有被评估的任务、文字风格和培训数据集大小中,传统模型始终比与变换器的学习要好,从而展示这些模型能够给基于文本的社会科学研究带来的益处。

0

相关内容

Learning

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

专知会员服务

69+阅读 · 2020年1月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

JAK/STAT/SOCS3信号通路介导卒中后认知损害的分子机制及中医“活血通络”法对其的干预研究

国家自然科学基金

0+阅读 · 2015年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

INF-γ通过CIITA调控PPARγ转录机制及其在2型糖尿病中意义的探讨

国家自然科学基金

0+阅读 · 2013年12月31日

Par-4在hTERT非端粒酶活性依赖抗凋亡中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

Tudor-SN蛋白调控细胞周期的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Stat3抑制myocardin诱导心肌肥厚的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于社交网络模型的视频分享关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

天基预警雷达STAP-TBD关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

自然与人类胁迫作用下河湖能力衰减生态响应机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

基于双语文档反馈的跨语言信息检索研究

国家自然科学基金

0+阅读 · 2008年12月31日

Revisiting Contextual Toxicity Detection in Conversations

Arxiv

0+阅读 · 2022年10月18日

Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks

Arxiv

0+阅读 · 2022年10月18日

Artificial Intelligence for the Metaverse: A Survey

Arxiv

31+阅读 · 2022年2月15日

Sequential Recommendation with Graph Neural Networks

Arxiv

15+阅读 · 2021年6月27日

Pretrained Transformers for Text Ranking: BERT and Beyond

Arxiv

28+阅读 · 2020年10月13日

Learning from Very Few Samples: A Survey

Arxiv

126+阅读 · 2020年9月6日

A Survey of Deep Learning for Scientific Discovery

A Survey of Deep Learning for Scientific Discovery

Arxiv

29+阅读 · 2020年3月26日

Memory Augmented Graph Neural Networks for Sequential Recommendation

Memory Augmented Graph Neural Networks for Sequential Recommendation

Arxiv

13+阅读 · 2019年12月26日

Graph Neural Networks for Social Recommendation

Arxiv

20+阅读 · 2019年11月23日

Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs

Arxiv

40+阅读 · 2019年6月4日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

专知会员服务

69+阅读 · 2020年1月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《物联网（IoT）中的无人机通信高效控制》135页

《在GNSS信号降级环境中利用共识实现无人机集群稳健协调》

中程单向攻击无人机的战略意义：俄乌战争启示

《面向无人机集群的避障动态传感器覆盖算法》最新38页

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Revisiting Contextual Toxicity Detection in Conversations

Arxiv

0+阅读 · 2022年10月18日

Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks

Arxiv

0+阅读 · 2022年10月18日

Artificial Intelligence for the Metaverse: A Survey

Arxiv

31+阅读 · 2022年2月15日

Sequential Recommendation with Graph Neural Networks

Arxiv

15+阅读 · 2021年6月27日

Pretrained Transformers for Text Ranking: BERT and Beyond

Arxiv

28+阅读 · 2020年10月13日

Learning from Very Few Samples: A Survey

Arxiv

126+阅读 · 2020年9月6日

A Survey of Deep Learning for Scientific Discovery

A Survey of Deep Learning for Scientific Discovery

Arxiv

29+阅读 · 2020年3月26日

Memory Augmented Graph Neural Networks for Sequential Recommendation

Memory Augmented Graph Neural Networks for Sequential Recommendation

Arxiv

13+阅读 · 2019年12月26日

Graph Neural Networks for Social Recommendation

Arxiv

20+阅读 · 2019年11月23日

Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs

Arxiv

40+阅读 · 2019年6月4日

相关基金

JAK/STAT/SOCS3信号通路介导卒中后认知损害的分子机制及中医“活血通络”法对其的干预研究

国家自然科学基金

0+阅读 · 2015年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

INF-γ通过CIITA调控PPARγ转录机制及其在2型糖尿病中意义的探讨

国家自然科学基金

0+阅读 · 2013年12月31日

Par-4在hTERT非端粒酶活性依赖抗凋亡中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

Tudor-SN蛋白调控细胞周期的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Stat3抑制myocardin诱导心肌肥厚的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于社交网络模型的视频分享关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

天基预警雷达STAP-TBD关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

自然与人类胁迫作用下河湖能力衰减生态响应机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

基于双语文档反馈的跨语言信息检索研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员