Natural language is generated by people, yet traditional language modeling views words or documents as if generated independently. Here, we propose human language modeling (HuLM), a hierarchical extension to the language modeling problem whereby a human level exists to connect sequences of documents (e.g., social media messages) and capture the notion that human language is moderated by changing human states. We introduce HaRT, a large-scale transformer model for the HuLM task, pre-trained on approximately 100,000 social media users, and demonstrate its effectiveness in terms of both language modeling (perplexity) for social media and fine-tuning for four downstream tasks spanning document and user levels: stance detection, sentiment classification, age estimation, and personality assessment. Results on all tasks meet or surpass the current state-of-the-art.
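To make the hierarchical conditioning concrete, the following LaTeX sketch contrasts standard language modeling with one plausible formalization of HuLM. The notation is an assumption for illustration, not necessarily the paper's own symbols: $d_1, \ldots, d_n$ are one person's documents in temporal order, $U_i$ is the latent human state after processing document $d_i$, and $f$ is a state-update function.

% Standard LM: each token is conditioned only on prior tokens,
% with documents treated as independent of one another.
\[
  p_{\mathrm{LM}}(w_t) = p\left(w_t \mid w_{1:t-1}\right)
\]
% HuLM (illustrative): each token is additionally conditioned on a
% latent human state that evolves across the person's sequence of
% documents, capturing that language is moderated by changing states.
\[
  p_{\mathrm{HuLM}}(w_t) = p\left(w_t \mid w_{1:t-1},\, U_{i-1}\right),
  \qquad
  U_i = f\left(U_{i-1},\, d_i\right)
\]

In a transformer realization such as HaRT, $f$ could be implemented as a recurrent update that folds each processed document into a user-state vector fed back as context for the next document; the precise mechanism belongs to the paper body and is not specified by this abstract.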