【Overview】On their blog, the team at Hugging Face shared a list of papers they believe can help you get up to speed with the cutting-edge problems and techniques of natural language processing. We have translated the article for you; take a look below.
Original article:
https://medium.com/huggingface/the-best-and-most-current-of-modern-natural-language-processing-5055f409a1d1
Author: Hugging Face
Translation: 专知
Over the past two years, we have all witnessed rapid progress in natural language processing across a wide variety of tasks and applications. This progress was enabled by a shift in how we approach NLP research.
For a long time, the standard recipe was to initialize the first layer of a neural network with pretrained word embeddings such as word2vec or GloVe, stack a task-specific architecture on top, and then train the whole model in a supervised fashion on a single dataset.
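To make that classical recipe concrete, here is a minimal PyTorch sketch. The vocabulary size, the random matrix standing in for real GloVe vectors, and the BiLSTM classifier head are all illustrative assumptions, not a reference implementation:

```python
import torch
import torch.nn as nn

# Illustrative sizes; a random matrix stands in for a real word2vec/GloVe
# matrix that would normally be loaded from disk.
vocab_size, embed_dim, hidden_dim, num_classes = 10000, 300, 128, 2
pretrained_vectors = torch.randn(vocab_size, embed_dim)

class TextClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # First layer initialized from pretrained word embeddings,
        # fine-tuned along with the rest of the model (freeze=False).
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=False)
        # Task-specific architecture stacked on top: BiLSTM encoder + linear head.
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)            # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.encoder(x)            # final hidden state per direction
        h = torch.cat([h_n[0], h_n[1]], dim=-1)  # (batch, 2 * hidden_dim)
        return self.head(h)

model = TextClassifier()
logits = model(torch.randint(0, vocab_size, (4, 20)))  # dummy batch of token ids
```

The whole model is then trained end-to-end with a supervised loss on that single labelled dataset.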
Recently, however, several works have shown that we can learn hierarchical contextual representations on web-scale datasets using unsupervised (or self-supervised) signals such as language modeling, and then transfer this pretraining to downstream tasks (transfer learning). Even more excitingly, this shift has led to significant advances on downstream applications, from question answering to natural language inference.
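Here is a minimal sketch of this pretrain-then-fine-tune recipe, using today's Hugging Face transformers library; the checkpoint name, the toy labelled example, and the learning rate are placeholder choices, not recommendations:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a language model pretrained on web-scale text, with a fresh
# classification head for the downstream task.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# One toy labelled example; a real run would loop over a task dataset.
batch = tokenizer(["a delightful film"], return_tensors="pt")
labels = torch.tensor([1])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
outputs = model(**batch, labels=labels)  # loss from the classification head
outputs.loss.backward()
optimizer.step()
```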
A few weeks ago, a friend of mine decided to dive into NLP. He already had a background in machine learning and deep learning, so he earnestly asked me: "Which papers should I read to catch up with the latest trends in modern NLP?"
This is a really good question, especially when you consider that the number of submissions to NLP conferences (and ML conferences in general) is growing exponentially: +80% for ACL 2019 vs. 2018, +90% for NAACL 2019 vs. 2018, ...
I put together this list of papers and resources for him, and I am happy to share it with the community, because I believe it can be useful to many people.
Disclaimer: this list is not exhaustive and does not cover every topic in NLP (for example, there is nothing on semantic parsing, adversarial learning, or reinforcement learning applied to NLP). It gathers some of the most influential work of the past few years/months (as of May 2019) and is heavily biased by what I happen to have read.
Transfer Learning
Deep contextualized word representations (NAACL 2018)
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
Universal Language Model Fine-tuning for Text Classification (ACL 2018)
Jeremy Howard, Sebastian Ruder
Improving Language Understanding by Generative Pre-Training
Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
Language Models are Unsupervised Multitask Learners
Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (NAACL 2019)
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
Cloze-driven Pretraining of Self-attention Networks (arXiv 2019)
Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
Unified Language Model Pre-training for Natural Language Understanding and Generation (arXiv 2019)
Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon
MASS: Masked Sequence to Sequence Pre-training for Language Generation (ICML 2019)
Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu
Representation Learning
What you can cram into a single vector: Probing sentence embeddings for linguistic properties (ACL 2018)
Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
No Training Required: Exploring Random Encoders for Sentence Classification (ICLR 2019)
John Wieting, Douwe Kiela
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (ICLR 2019)
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems (arXiv 2019)
Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
Linguistic Knowledge and Transferability of Contextual Representations (NAACL 2019)
Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks (arXiv 2019)
Matthew Peters, Sebastian Ruder, Noah A. Smith
Neural Dialogue
A Neural Conversational Model (ICML Deep Learning Workshop 2015)
Oriol Vinyals, Quoc Le
A Persona-Based Neural Conversation Model (ACL 2016)
Jiwei Li, Michel Galley, Chris Brockett, Georgios P. Spithourakis, Jianfeng Gao, Bill Dolan
A Simple, Fast Diverse Decoding Algorithm for Neural Generation (arXiv 2017)
Jiwei Li, Will Monroe, Dan Jurafsky
Neural Approaches to Conversational AI (arXiv 2018)
Jianfeng Gao, Michel Galley, Lihong Li
TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents (NeurIPS 2018 CAI Workshop)
Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
Wizard of Wikipedia: Knowledge-Powered Conversational agents (ICLR 2019)
Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, Jason Weston
Learning to Speak and Act in a Fantasy Text Adventure Game (arXiv 2019)
Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, Jason Weston
Various picks
Pointer Networks (NIPS 2015)
Oriol Vinyals, Meire Fortunato, Navdeep Jaitly
End-To-End Memory Networks (NIPS 2015)
Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus
Get To The Point: Summarization with Pointer-Generator Networks (ACL 2017)
Abigail See, Peter J. Liu, Christopher D. Manning
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data (EMNLP 2017)
Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, Antoine Bordes
End-to-end Neural Coreference Resolution (EMNLP 2017)
Kenton Lee, Luheng He, Mike Lewis, Luke Zettlemoyer
StarSpace: Embed All The Things! (AAAI 2018)
Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston
The Natural Language Decathlon: Multitask Learning as Question Answering (arXiv 2018)
Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher
Character-Level Language Modeling with Deeper Self-Attention (arXiv 2018)
Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
Linguistically-Informed Self-Attention for Semantic Role Labeling (EMNLP 2018)
Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum
Phrase-Based & Neural Unsupervised Machine Translation (EMNLP 2018)
Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc’Aurelio Ranzato
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning (ICLR 2018)
Sandeep Subramanian, Adam Trischler, Yoshua Bengio, Christopher J Pal
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (arXiv 2019)
Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov
Universal Transformers (ICLR 2019)
Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Łukasz Kaiser
An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models (NAACL 2019)
Alexandra Chronopoulou, Christos Baziotis, Alexandros Potamianos
For some of the older papers, citation count is usually a good indicator of their impact.
General resources
Books
Speech and Language Processing (3rd ed. draft)
Dan Jurafsky and James H. Martin
Neural Network Methods for Natural Language Processing
Yoav Goldberg
Course materials
Natural Language Understanding and Computational Semantics with Katharina Kann and Sam Bowman at NYU
CS224n: Natural Language Processing with Deep Learning with Chris Manning and Abigail See at Stanford
Contextual Word Representations: A Contextual Introduction from Noah Smith’s teaching material at UW
Blogs/podcasts
Sebastian Ruder’s blog
Jay Alammar’s illustrated blog
NLP Highlights hosted by Matt Gardner and Waleed Ammar
Others
Papers With Code
arXiv daily newsletter
Survey papers
Reading the resources above should give you a good grasp of the latest trends in modern NLP; I hope they help you build your own NLP systems!
One last thing I haven't discussed much in this post, but which I find extremely important: reading is good, but reproducing the code is even better! You often learn a lot more by digging into the (sometimes) accompanying code or by trying to reimplement some of it yourself. Practical resources include the amazing blog posts and courses from fast.ai, as well as our open-source repositories.
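As one example of how low the barrier to experimenting has become, here is a quick way to poke at a pretrained model via the pipeline API of the transformers library (the checkpoint and the prompt are just illustrative):

```python
from transformers import pipeline

# Masked-language-model inference with a pretrained BERT checkpoint.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill_mask("Modern NLP is driven by [MASK] learning."):
    print(candidate["token_str"], candidate["score"])
```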