[Overview] This article covers recent natural language processing papers, code, blog posts, and research trends.
Fastai
Lesson 4 Practical Deep Learning for Coders
https://course.fast.ai/videos/?lesson=4
It walks through how language models are implemented in fastai.
LSTM:
Even though the Transformer is now more popular, it is still worth learning about LSTMs: you may still end up using them in some situations, and they were the first models to achieve good results on sequence data. A minimal usage sketch follows the links below.
The original LSTM paper:
https://www.bioinf.jku.at/publications/older/2604.pdf
A blog post that explains the LSTM model in detail: Understanding LSTM Networks
https://colah.github.io/posts/2015-08-Understanding-LSTMs
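Before diving into the papers, here is a minimal PyTorch sketch (all dimensions are purely illustrative) showing how an LSTM consumes a batch of embedded token sequences:

```python
import torch
import torch.nn as nn

# Toy dimensions, illustrative only: 4 sequences, 10 time steps,
# 32-dim embeddings, 64-dim hidden state.
lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=1, batch_first=True)

x = torch.randn(4, 10, 32)        # (batch, seq_len, embedding_dim)
output, (h_n, c_n) = lstm(x)      # output holds the hidden state at every step

print(output.shape)               # torch.Size([4, 10, 64])
print(h_n.shape, c_n.shape)       # final hidden and cell states: (1, 4, 64) each
```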
AWD-LSTM
Builds on the LSTM by adding dropout and other regularization to overcome the weaknesses of the original LSTM (see the sketch after the links below).
Paper:
https://arxiv.org/pdf/1708.02182.pdf
Official Salesforce implementation:
https://github.com/salesforce/awd-lstm-lm
fastai implementation:
https://github.com/fastai/fastai/blob/master/fastai/text/models/awd_lstm.py
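To make the core trick concrete, here is a minimal sketch of a single LSTM cell with DropConnect applied to the recurrent weights, the central idea of the AWD-LSTM paper. This is my own illustrative code, not the fastai or Salesforce implementation, and the real implementation samples the weight mask once per sequence rather than once per call:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightDropLSTMCell(nn.Module):
    """Illustrative sketch: dropout is applied to the hidden-to-hidden
    weight matrix itself (DropConnect), not to the activations."""
    def __init__(self, input_size, hidden_size, weight_p=0.5):
        super().__init__()
        self.weight_p = weight_p
        self.w_ih = nn.Parameter(torch.randn(4 * hidden_size, input_size) * 0.1)
        self.w_hh = nn.Parameter(torch.randn(4 * hidden_size, hidden_size) * 0.1)
        self.bias = nn.Parameter(torch.zeros(4 * hidden_size))

    def forward(self, x, state):
        h, c = state
        # Drop entries of the recurrent weight matrix (the AWD-LSTM trick).
        w_hh = F.dropout(self.w_hh, p=self.weight_p, training=self.training)
        gates = x @ self.w_ih.t() + h @ w_hh.t() + self.bias
        i, f, g, o = gates.chunk(4, dim=-1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

cell = WeightDropLSTMCell(32, 64)
h = c = torch.zeros(4, 64)
out, (h, c) = cell(torch.randn(4, 32), (h, c))  # one time step
```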
Pointer models
Paper:
https://arxiv.org/pdf/1609.07843.pdf
Official video introduction:
https://www.youtube.com/watch?v=Ibt8ZpbX3D8
Improving Neural Language Models with a Continuous Cache paper:
https://openreview.net/pdf?id=B14E5qee
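The cache idea is simple enough to sketch directly (this is my own illustrative code, not the authors'): keep the last n hidden states and the words that followed them, score them against the current hidden state, and interpolate with the ordinary softmax:

```python
import torch
import torch.nn.functional as F

def cache_lm_probs(p_vocab, h_t, cache_h, cache_words, vocab_size,
                   theta=0.3, lam=0.1):
    """Continuous-cache mixing (Grave et al.), illustrative version.
    p_vocab:     (vocab_size,) ordinary LM probabilities at step t
    h_t:         (hidden,)     current hidden state
    cache_h:     (n, hidden)   hidden states from the last n steps
    cache_words: (n,)          ids of the words that followed those states
    """
    scores = theta * (cache_h @ h_t)            # similarity to cached states
    attn = F.softmax(scores, dim=0)             # distribution over cache slots
    p_cache = torch.zeros(vocab_size)
    p_cache.scatter_add_(0, cache_words, attn)  # pool mass by word id
    return (1 - lam) * p_vocab + lam * p_cache  # interpolate the two experts
```

theta and lam correspond to the paper's flattening and interpolation hyperparameters; the default values here are placeholders.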
Attention
Just remember: attention is not all you need.
The CS224n lecture video explains attention, starting at 1:00:55:
https://www.youtube.com/watch?v=XXtpJxZBa2c
The Attention Is All You Need paper, which also introduced the Transformer:
https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
Official video introduction:
https://www.youtube.com/watch?v=rBCqOTEfxvg
Google blog post:
https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
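The paper's central equation, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, fits in a few lines of PyTorch; here is a minimal sketch:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Scaled dot-product attention from 'Attention Is All You Need'.
    q, k, v: (batch, seq_len, d_k) tensors; mask zeros out forbidden positions."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, q_len, k_len)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    weights = F.softmax(scores, dim=-1)                # attention weights
    return weights @ v, weights

# Toy self-attention: 2 sequences of length 5, dimension 16.
x = torch.randn(2, 5, 16)
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape, attn.shape)  # (2, 5, 16) and (2, 5, 5)
```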
Another Transformer variant, the Transformer-XL paper: Attentive Language Models Beyond a Fixed-Length Context
https://arxiv.org/pdf/1901.02860.pdf
Google's official Transformer-XL blog post:
https://ai.googleblog.com/2019/01/transformer-xl-unleashing-potential-of.html
Transformer-XL — Combining Transformers and RNNs Into a State-of-the-art Language Model
https://www.lyrn.ai/2019/01/16/transformer-xl-sota-language-model
Attention and Memory in Deep Learning and NLP blog
http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp
Attention and Augmented Recurrent Neural Networks blog
https://distill.pub/2016/augmented-rnns
Building the Mighty Transformer for Sequence Tagging in PyTorch: Part 1 blog
https://medium.com/@kolloldas/building-the-mighty-transformer-for-sequence-tagging-in-pytorch-part-i-a1815655cd8
Building the Mighty Transformer for Sequence Tagging in PyTorch: Part 2 blog
https://medium.com/@kolloldas/building-the-mighty-transformer-for-sequence-tagging-in-pytorch-part-ii-c85bf8fd145
Multi-task learning
An overview of Multi-Task Learning in deep neural networks
https://arxiv.org/pdf/1706.05098.pdf
The Natural Language Decathlon: Multitask Learning as Question Answering
https://arxiv.org/abs/1806.08730
Multi-Task Deep Neural Networks for Natural Language Understanding
https://arxiv.org/pdf/1901.11504.pdf
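The simplest setup discussed in Ruder's overview above is hard parameter sharing: one shared encoder with a small head per task. A minimal sketch (task names and dimensions are made up for illustration):

```python
import torch
import torch.nn as nn

class HardSharingModel(nn.Module):
    """One shared LSTM encoder feeding two task-specific linear heads."""
    def __init__(self, vocab_size=1000, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)  # shared layers
        self.sentiment_head = nn.Linear(hidden, 2)   # task A: 2 classes
        self.topic_head = nn.Linear(hidden, 5)       # task B: 5 classes

    def forward(self, tokens, task):
        _, (h_n, _) = self.encoder(self.embed(tokens))
        h = h_n[-1]                                  # last layer's final state
        head = self.sentiment_head if task == 'sentiment' else self.topic_head
        return head(h)

model = HardSharingModel()
logits = model(torch.randint(0, 1000, (4, 12)), task='sentiment')
print(logits.shape)  # torch.Size([4, 2])
```

Training alternates batches between tasks, so gradients from both flow through the shared encoder.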
PyTorch
PyTorch tutorials on text processing:
https://pytorch.org/tutorials/#text
Recent advances in the field are summarized at:
http://ruder.io/nlp-imagenet
ELMo
The Deep Contextualized Word Representations paper:
https://arxiv.org/abs/1802.05365
Video introduction:
https://vimeo.com/277672840
ULMFiT:
Universal Language Model Fine-tuning for Text Classification paper:
https://arxiv.org/abs/1801.06146
Jeremy Howard's blog post:
http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html
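For reference, the two-stage ULMFiT workflow looks roughly like this in the fastai v1 text API (the API current when this article was written; 'data' and 'texts.csv' are placeholders and the hyperparameters are illustrative):

```python
from fastai.text import *  # fastai v1

path = Path('data')  # placeholder directory containing texts.csv

# Stage 1: fine-tune a pretrained AWD-LSTM language model on your corpus.
data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')
learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn_lm.fit_one_cycle(1, 1e-2)
learn_lm.save_encoder('ft_enc')

# Stage 2: reuse the fine-tuned encoder in a text classifier.
data_clas = TextClasDataBunch.from_csv(path, 'texts.csv',
                                       vocab=data_lm.train_ds.vocab)
learn_clas = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn_clas.load_encoder('ft_enc')
learn_clas.fit_one_cycle(1, 1e-2)
```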
OpenAI GPT
GPT-1 paper:
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
Blog post:
https://openai.com/blog/language-unsupervised
Code:
https://github.com/openai/finetune-transformer-lm
GPT-2 paper:
https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
Blog post:
https://openai.com/blog/better-language-models
Code:
https://github.com/openai/gpt-2
GPT-2 video:
https://www.youtube.com/watch?v=T0I88NhR9M
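GPT-2 generates text autoregressively; the loop below is a generic sketch of that idea with top-k filtering (my own illustrative code, not OpenAI's; `model` is assumed to map a (1, seq_len) tensor of token ids to (1, seq_len, vocab) logits):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sample(model, prompt_ids, steps=20, k=40, temperature=1.0):
    """Top-k autoregressive sampling, GPT-2 style (illustrative)."""
    ids = prompt_ids                                  # (1, seq_len) long tensor
    for _ in range(steps):
        logits = model(ids)[:, -1, :] / temperature   # next-token logits
        topk_vals, topk_idx = logits.topk(k, dim=-1)  # keep the k best tokens
        probs = F.softmax(topk_vals, dim=-1)
        next_id = topk_idx.gather(-1, torch.multinomial(probs, 1))
        ids = torch.cat([ids, next_id], dim=-1)       # append and repeat
    return ids
```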
BERT
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding paper:
https://arxiv.org/abs/1810.04805
Google's official blog post:
https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
Dissecting BERT Part 1: The Encoder blog post:
https://medium.com/dissecting-bert/dissecting-bert-part-1-d3c3d495cdb3
Understanding BERT Part 2: BERT Specifics blog post:
https://medium.com/dissecting-bert/dissecting-bert-part2-335ff2ed9c73
Dissecting BERT Appendix: The Decoder blog post:
https://medium.com/dissecting-bert/dissecting-bert-appendix-the-decoder-3b86f66b0e5f
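At the heart of BERT's pretraining, as these posts explain, is the masked-LM objective: corrupt some input tokens and train the encoder to recover them. A simplified sketch (the real recipe masks 15% of tokens and, of those, keeps 10% unchanged and swaps 10% for random tokens):

```python
import torch

def mask_tokens(input_ids, mask_token_id, mask_prob=0.15):
    """Simplified BERT-style masking: every selected position becomes
    [MASK]; labels are -100 (ignored by cross-entropy) elsewhere."""
    labels = input_ids.clone()
    mask = torch.rand(input_ids.shape) < mask_prob  # choose positions
    labels[~mask] = -100                            # only score masked slots
    corrupted = input_ids.clone()
    corrupted[mask] = mask_token_id                 # replace with [MASK] id
    return corrupted, labels

ids = torch.randint(5, 1000, (2, 8))                # toy batch of token ids
x, y = mask_tokens(ids, mask_token_id=4)
print(x)  # inputs with some positions replaced by the mask id
print(y)  # -100 everywhere except the positions the model must predict
```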
Original article:
https://medium.com/@kushajreal/how-to-become-an-expert-in-nlp-in-2019-1-945f4e9073c0
-END-