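This article collects benchmark datasets for natural language processing, along with recent papers on common NLP tasks such as machine translation, machine reading and question answering, sequence labeling, knowledge graphs and social computing, and sentiment analysis and text classification.
Thanks to IsaacChanghau for compiling and generously sharing this list. Original source: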
https://github.com/IsaacChanghau/DL-NLP-Readings
NLP Datasets
Sequence Labeling
· [2002 CoNLL] Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition, [paper], [bibtex], [dataset].
· [2003 CoNLL] Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition, [paper], [bibtex], [dataset].
· [2017 CoNLL] CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, [paper], [bibtex], [homepage].
· [2017 ACL] Cross-lingual Name Tagging and Linking for 282 Languages, [paper], [bibtex], [homepage].
Machine Reading and Question Answering
· [2013 EMNLP] MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text, [paper], [bibtex], [homepage], source: [mcobzarenco/mctest].
· [2015 NIPS] CNN/DailyMail: Teaching Machines to Read and Comprehend, [paper], [bibtex], [homepage], sources: [thomasmesnard/DeepMind-Teaching-Machines-to-Read-and-Comprehend].
· [2016 EMNLP] SQuAD: 100,000+ Questions for Machine Comprehension of Text, [paper], [bibtex], [homepage].
· [2016 ICLR] bAbI: Towards AI-Complete Question Answering: a Set of Prerequisite Toy Tasks, [paper], [bibtex], [homepage], sources: [facebook/bAbI-tasks].
· [2017 EMNLP] World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions, [paper], [bibtex], [homepage].
· [2017 EMNLP] RACE: Large-scale ReAding Comprehension Dataset From Examinations, [paper], [bibtex], [homepage], sources: [qizhex/RACE_AR_baselines].
· [2017 ACL] TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension, [paper], [bibtex], [homepage], sources: [mandarjoshi90/triviaqa].
· [2018 TACL] QAngaroo: Constructing Datasets for Multi-hop Reading Comprehension Across Documents, [paper], [bibtex], [homepage].
· [2018 ICLR] CLOTH: Large-scale Cloze Test Dataset Designed by Teachers, [paper], [bibtex], [homepage], sources: [qizhex/Large-scale-Cloze-Test-Dataset-Designed-by-Teachers].
· [2018 NAACL] MultiRC: Looking Beyond the Surface - A Challenge Set for Reading Comprehension over Multiple Sentences, [paper], [bibtex], [homepage], sources: [CogComp/multirc].
· [2018 EMNLP] HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering, [paper], [bibtex], [attachment], [homepage], sources: [hotpotqa/hotpot].
Commonsense Knowledge Bases
· [2017 AAAI] ConceptNet 5.5: An Open Multilingual Graph of General Knowledge, [paper], [bibtex], sources: [GitHub page], [commonsense/conceptnet5], [commonsense/conceptnet-numberbatch].
Recurrent Neural Networks (RNN)
· [2001 PhD Thesis] Long Short-Term Memory in Recurrent Neural Networks, [Gers' Ph.D. Thesis].
· [2014 ArXiv] Recurrent Neural Network Regularization, [paper].
· [2015 ArXiv] Grid Long Short-Term Memory, [paper], sources: [Tensorflow-GridLSTMCell].
· [2016 ArXiv] Visualizing and Understanding Curriculum Learning for Long Short-Term Memory Networks, [paper].
· [2016 ICLR] Visualizing and Understanding Recurrent Networks, [paper].
· [2016 NIPS] Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences, [paper], sources: [Tensorflow-PhasedLSTMCell].
· [2017 ACML] Nested LSTMs, [paper], sources: [hannw/nlstm], [titu1994/Nested-LSTM].
· [2017 ICLR] Variable Computation in Recurrent Neural Networks, [paper].
· [2018 EMNLP] Simple Recurrent Units for Highly Parallelizable Recurrence, [paper], [bibtex], sources: [taolei87/sru].
· [2018 ICLR] Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks, [paper], [homepage], sources: [imatge-upc/skiprnn-2017-telecombcn].
Machine Translation
· [2014 SSST] On the Properties of Neural Machine Translation: Encoder-Decoder Approaches, [paper].
· [2015 ICLR] Neural Machine Translation by Jointly Learning to Align and Translate, [paper], sources: [lisa-groundhog/GroundHog], [tensorflow/nmt].
· [2015 EMNLP] Effective Approaches to Attention-based Neural Machine Translation, [paper], [HarvardNLP homepage], sources: [dillonalaird/Attention], [tensorflow/nmt].
· [2016 ACL] Neural Machine Translation of Rare Words with Subword Units, [paper], [bibtex], [software], sources: [rsennrich/subword-nmt], [soaxelbrooke/python-bpe].
· [2017 ACL] A Convolutional Encoder Model for Neural Machine Translation, [paper], sources: [facebookresearch/fairseq].
· [2017 NIPS] Attention is All You Need, [paper], [Chinese blog], sources: [Kyubyong/transformer], [jadore801120/attention-is-all-you-need-pytorch], [DongjunLee/transformer-tensorflow].
· [2017 EMNLP] Neural Machine Translation with Word Predictions, [paper].
· [2017 EMNLP] Massive Exploration of Neural Machine Translation Architectures, [paper], [homepage], sources: [google/seq2seq].
· [2017 EMNLP] Efficient Attention using a Fixed-Size Memory Representation, [paper].
· [2018 AMTA] Context Models for OOV Word Translation in Low-Resource Language, [paper].
· [2018 NAACL] Self-Attention with Relative Position Representations, [paper].
· [2018 COLING] Double Path Networks for Sequence to Sequence Learning, [paper].
Machine Reading and Question Answering
· [2014 NIPS] Deep Learning for Answer Sentence Selection, [paper], sources: [brmson/Sentence-selection].
· [2014 ACL] Freebase QA: Information Extraction or Semantic Parsing?, [paper].
· [2015 IJCAI] Convolutional Neural Tensor Network Architecture for Community-based Question Answering, [paper], [bibtex], sources: [GauravBh1010tt/DeepLearn], [SongRb/Seq2SeqLearning].
· [2015 NIPS] Pointer Networks, [paper], [blog], sources: [devsisters/pointer-network-tensorflow], [ikostrikov/TensorFlow-Pointer-Networks], [keon/pointer-networks], [pemami4911/neural-combinatorial-rl-pytorch], [shiretzet/PointerNet].
· [2016 ACL] Question Answering on Freebase via Relation Extraction and Textual Evidence, [paper], sources: [syxu828/QuestionAnsweringOverFB].
· [2016 EMNLP] Long Short-Term Memory-Networks for Machine Reading, [paper], sources: [cheng6076/SNLI-attention], [vsitzmann/snli-attention-tensorflow].
· [2016 ICLR] LSTM-based Deep Learning Models for Non-factoid Answer Selection, [paper], sources: [Alan-Lee123/answer-selection], [tambetm/allenAI].
· [2016 ICML] Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, [paper], sources: [DongjunLee/dmn-tensorflow].
· [2016 ACL] A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task, [paper], sources: [danqi/rc-cnn-dailymail].
· [2016 ICML] Dynamic Memory Networks for Visual and Textual Question Answering, [paper], [blog], sources: [therne/dmn-tensorflow], [barronalex/Dynamic-Memory-Networks-in-TensorFlow], [ethancaballero/Improved-Dynamic-Memory-Networks-DMN-plus], [dandelin/Dynamic-memory-networks-plus-Pytorch], [DeepRNN/visual_question_answering].
· [2017 ICLR] Query-Reduction Networks for Question Answering, [paper], [homepage], sources: [uwnlp/qrn].
· [2017 ICLR] Bi-Directional Attention Flow for Machine Comprehension, [paper], [homepage], [demo], sources: [allenai/bi-att-flow].
· [2017 ACL] Learning to Skim Text, [paper], [notes].
· [2017 ACL] R-Net: Machine Reading Comprehension with Self-matching Networks, [paper], [blog], sources: [HKUST-KnowComp/R-Net], [YerevaNN/R-NET-in-Keras], [minsangkim142/R-net].
· [2017 ICLR] Machine Comprehension Using Match-LSTM and Answer Pointer, [paper], sources: [shuohangwang/SeqMatchSeq], [MurtyShikhar/Question-Answering], [InnerPeace-Wu/reading_comprehension-cs224n].
· [2017 EMNLP] Accurate Supervised and Semi-Supervised Machine Reading for Long Documents, [paper], [bibtex].
· [2017 ArXiv] Simple and Effective Multi-Paragraph Reading Comprehension, [paper], sources: [allenai/document-qa].
· [2017 CoNLL] Making Neural QA as Simple as Possible but not Simpler, [paper], [homepage], [github-page], sources: [georgwiese/biomedical-qa].
· [2017 EMNLP] Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension, [paper], sources: [davidgolub/QuestionGeneration].
· [2017 ACL] Attention-over-Attention Neural Networks for Reading Comprehension, [paper], sources: [OlavHN/attention-over-attention], [marshmelloX/attention-over-attention].
· [2017 EMNLP] Identifying Where to Focus in Reading Comprehension for Neural Question Generation, [paper], [bibtex].
· [2017 ACL] Improved Neural Relation Detection for Knowledge Base Question Answering, [paper].
· [2017 ACL] An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge, [paper], [homepage], [blog].
· [2017 EMNLP] Learning what to read: Focused machine reading, [paper], [bibtex].
· [2017 ACL] Reading Wikipedia to Answer Open-Domain Questions, [paper], sources: [facebookresearch/DrQA], [hitvoice/DrQA].
· [2018 ICLR] MaskGAN: Better Text Generation via Filling in the ______, [paper].
· [2018 AAAI] Multi-attention Recurrent Network for Human Communication Comprehension, [paper].
· [2018 ArXiv] An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering, [paper].
· [2018 ICLR] FusionNet: Fusing via Fully-aware Attention with Application to Machine Comprehension, [paper], sources: [exe1023/FusionNet], [momohuang/FusionNet-NLI].
· [2018 NAACL] Contextualized Word Representations for Reading Comprehension, [paper], sources: [shimisalant/CWR].
· [2018 ICLR] QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension, [paper], sources: [hengruo/QANet-pytorch], [NLPLearn/QANet].
· [2018 ICLR] Neural Speed Reading via Skim-RNN, [paper], sources: [schelotto/Neural_Speed_Reading_via_Skim-RNN_PyTorch].
· [2018 SemEval] Yuanfudao at SemEval-2018 Task 11: Three-way Attention and Relational Knowledge for Commonsense Machine Comprehension, [paper], sources: [intfloat/commonsense-rc].
· [2018 ACL] Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge, [paper].
· [2018 ACL] Stochastic Answer Networks for Machine Reading Comprehension, [paper], [bibtex], sources: [kevinduh/san_mrc].
Dialogue, Chatbots, and NLG Systems
Covers dialogue systems, chatbots, dialogue algorithms, natural language generation methods, and more.
· [2013 IEEE] POMDP-based Statistical Spoken Dialogue Systems: a Review, [paper].
· [2014 NIPS] Sequence to Sequence Learning with Neural Networks, [paper], sources: [farizrahman4u/seq2seq], [ma2rten/seq2seq], [JayParks/tf-seq2seq], [macournoyer/neuralconvo].
· [2015 CIKM] A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion, [paper], sources: [sordonia/hred-qs].
· [2015 EMNLP] Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems, [paper], sources: [shawnwun/RNNLG], [hit-computer/SC-LSTM].
· [2015 ArXiv] Attention with Intention for a Neural Network Conversation Model, [paper].
· [2015 ACL] Neural Responding Machine for Short-Text Conversation, [paper].
· [2016 AAAI] Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models, [paper], sources: [suriyadeepan/augmented_seq2seq], [julianser/hed-dlg], [sordonia/hed-dlg], [julianser/hred-latent-piecewise], [julianser/hed-dlg-truncated].
· [2016 ACL] On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems, [paper].
· [2016 EMNLP] Deep Reinforcement Learning for Dialogue Generation, [paper], sources: [liuyuemaicha/Deep-Reinforcement-Learning-for-Dialogue-Generation-in-tensorflow].
· [2016 EMNLP] Multi-view Response Selection for Human-Computer Conversation, [paper].
· [2017 ACM] A Survey on Dialogue Systems: Recent Advances and New Frontiers, [paper], sources: [shawnspace/survey-in-dialog-system].
· [2017 EMNLP] Adversarial Learning for Neural Dialogue Generation, [paper], sources: [jiweil/Neural-Dialogue-Generation], [liuyuemaicha/Adversarial-Learning-for-Neural-Dialogue-Generation-in-Tensorflow].
· [2017 ACL] Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots, [paper], sources: [MarkWuNLP/MultiTurnResponseSelection], [krayush07/sequential-match-network].
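Sequence Labeling (POS, NER, SRL, RE, Dependency Parsing, Entity Linking, Punctuation Restoration, etc.)
Covers part-of-speech tagging, chunking, named entity recognition (NER), semantic role labeling (SRL), punctuation restoration, sentence segmentation, dependency parsing, relation extraction, entity linking, and more.
Part-of-Speech Tagging and Named Entity Recognition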
· [2010 ACL] On Jointly Recognizing and Aligning Bilingual Named Entities, [paper], [bibtex].
· [2012 CIKM] Joint Bilingual Name Tagging for Parallel Corpora, [paper], [bibtex].
· [2012 Springer] Supervised Sequence Labelling with Recurrent Neural Networks, [Alex Graves's Ph.D. Thesis].
· [2015 ArXiv] Bidirectional LSTM-CRF Models for Sequence Tagging, [paper], [bibtex], [blog], sources: [Hironsan/anago], [guillaumegenthial/sequence_tagging].
· [2015 Cheminformatics] Enhancing of chemical compound and drug name recognition using representative tag scheme and fine-grained tokenization, [paper], [bibtex].
· [2016 ArXiv] Multi-Task Cross-Lingual Sequence Tagging from Scratch, [paper], [bibtex].
· [2016 EMNLP] Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping, [paper], [bibtex].
· [2016 NAACL] Name Tagging for Low-resource Incident Languages based on Expectation-driven Learning, [paper], [bibtex].
· [2016 ICLR] Multi-task Sequence to Sequence Learning, [paper], [bibtex].
· [2016 ACL] Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss, [paper], [bibtex], sources: [bplank/bilstm-aux].
· [2016 ACL] Named Entity Recognition with Bidirectional LSTM-CNNs, [paper], [bibtex], sources: [ThanhChinhBK/Ner-BiLSTM-CNNs].
· [2016 NAACL] Neural Architectures for Named Entity Recognition, [paper], [bibtex], sources: [clab/stack-lstm-ner], [glample/tagger], [marekrei/sequence-labeler].
· [2016 ACL] End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, [paper], [bibtex], sources: [LopezGG/NN_NER_tensorFlow].
· [2017 IJCNLP] Segment-Level Neural Conditional Random Fields for Named Entity Recognition, [paper], [bibtex].
· [2017 IJCNLP] Low-Resource Named Entity Recognition with Cross-Lingual, Character-Level Neural Conditional Random Fields, [paper], [bibtex].
· [2017 WNUT] A Multi-task Approach for Named Entity Recognition in Social Media Data, [paper], [bibtex], sources: [tavo91/NER-WNUT17].
· [2017 ACL] Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection, [paper], [bibtex].
· [2017 RLNLP] Multi-task Domain Adaptation for Sequence Tagging, [paper], [bibtex].
· [2017 EMNLP] Cheap Translation for Cross-Lingual Named Entity Recognition, [paper], [bibtex].
· [2017 ACL] Semi-supervised Multitask Learning for Sequence Labeling, [paper], [bibtex].
· [2017 EMNLP] Part-of-Speech Tagging for Twitter with Adversarial Neural Networks, [paper], [bibtex], sources: [guitaowufeng/TPANN].
· [2017 EMNLP] Fast and Accurate Entity Recognition with Iterated Dilated Convolutions, [paper], [bibtex], sources: [iesl/dilated-cnn-ner].
· [2017 ICLR] Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks, [paper], [bibtex], sources: [kimiyoung/transfer].
· [2017 ArXiv] Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks, [paper], [bibtex], sources: [UKPLab/emnlp2017-bilstm-cnn-crf].
· [2017 EMNLP] Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging, [paper], [bibtex], sources: [UKPLab/emnlp2017-bilstm-cnn-crf].
· [2017 InterSpeech] Label-dependency coding in Simple Recurrent Networks for Spoken Language Understanding, [paper], [bibtex].
· [2017 ACL] Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary, [paper], [bibtex], sources: [mengf1/trpos].
· [2017 EMNLP] Semi-Supervised Structured Prediction with Neural CRF Autoencoder, [paper], [bibtex], sources: [cosmozhang/NCRF-AE].
· [2017 EMNLP] Cross-Lingual Transfer Learning for POS Tagging without Cross-Lingual Resources, [paper], [bibtex].
· [2017 ACL] Semi-supervised Sequence Tagging with Bidirectional Language Models, [paper], [bibtex].
· [2018 LREC] Transfer Learning for Named-Entity Recognition with Neural Networks, [paper], [bibtex], sources: [Franck-Dernoncourt/NeuroNER].
· [2018 ICLR] Deep Active Learning for Named Entity Recognition, [paper], [bibtex].
· [2018 AAAI] Empower Sequence Labeling with Task-Aware Neural Language Model, [paper], [bibtex], sources: [LiyuanLucasLiu/LM-LSTM-CRF].
· [2018 NAACL] Robust Multilingual Part-of-Speech Tagging via Adversarial Training, [paper], [bibtex], sources: [michiyasunaga/pos_adv].
· [2018 ArXiv] Improving Part-of-speech Tagging Via Multi-task Learning and Character-level Word Representations, [paper], [bibtex].
· [2018 NAACL] Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition, [paper], [bibtex], sources: [felixwzh/La-DTL].
· [2018 NAACL] Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens, [paper], [bibtex].
· [2018 ACL] Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings, [paper], [bibtex], sources: [google/meta_tagger].
· [2018 ACL] Named Entity Recognition With Parallel Recurrent Neural Networks, [paper], [bibtex].
· [2018 ACL] Chinese NER Using Lattice LSTM, [paper], [bibtex], sources: [jiesutd/LatticeLSTM].
· [2018 ACL] Hybrid semi-Markov CRF for Neural Sequence Labeling, [paper], [bibtex], sources: [ZhixiuYe/HSCRF-pytorch].
· [2018 ACL] A Multi-lingual Multi-task Architecture for Low-resource Sequence Labeling, [paper], [bibtex], sources: [limteng-rpi/mlmt].
· [2018 AAAI] Adversarial Learning for Chinese NER from Crowd Annotations, [paper], [bibtex], sources: [SUDA-HLT/ALCrowd].
· [2018 IJCAI] Improving Low Resource Named Entity Recognition using Cross-lingual Knowledge Transfer, [paper], [bibtex], sources: [scir-code/lrner].
· [2018 COLING] Contextual String Embeddings for Sequence Labeling, [paper], [bibtex], sources: [zalandoresearch/flair].
Semantic Role Labeling (SRL)
· [2015 ACL] End-to-end Learning of Semantic Role Labeling using Recurrent Neural Networks, [paper], [bibtex], sources: [sanjaymeena/semantic_role_labeling_deep_learning], [hiroki13/neural-semantic-role-labeler].
· [2016 ACL] Neural Semantic Role Labeling with Dependency Path Embeddings, [paper], [bibtex], sources: [microth/PathLSTM].
· [2017 ACL] Deep Semantic Role Labeling: What Works and What's Next, [paper], [bibtex], sources: [luheng/deep_srl].
· [2018 AAAI] Deep Semantic Role Labeling with Self-Attention, [paper], [bibtex], sources: [XMUNLP/Tagger].
· [2018 EMNLP] Linguistically-Informed Self-Attention for Semantic Role Labeling, [paper], [Supplemental Material], [bibtex], [author], [slides], [slides w/ notes], sources: [strubell/LISA].
Punctuation Restoration and Sentence Segmentation
· [2016 Interspeech] Bidirectional Recurrent Neural Network with Attention Mechanism for Punctuation Restoration, [paper], [bibtex], sources: [ottokart/punctuator2].
· [2017 ICASSP] Sequence-to-Sequence Models for Punctuated Transcription Combining Lexical and Acoustic Features, [paper], [bibtex], sources: [choko/acoustic_punctuation].
· [2017 SLSP] Attentional Parallel RNNs for Generating Punctuation in Transcribed Speech, [paper], [bibtex], [dataset], sources: [alpoktem/punkProse].
· [2017 EACL] Sentence Segmentation in Narrative Transcripts from Neuropsychological Tests using Recurrent Convolutional Neural Networks, [paper], [bibtex].
Dependency Parsing
· [2014 EMNLP] A Fast and Accurate Dependency Parser using Neural Networks, [paper], [bibtex], sources: [akjindal53244/dependency_parsing_tf], [ljj314zz/dependency_parsing_tf-master].
· [2016 TACL] Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representation, [paper], [bibtex], sources: [elikip/bist-parser].
· [2017 ICLR] Deep Bi-Affine Attention for Neural Dependency Parsing, [paper], [bibtex], sources: [tdozat/Parser-v1], [tdozat/Parser-v2].
· [2018 ACL] Simpler but More Accurate Semantic Dependency Parsing, [paper], [bibtex].
Relation Extraction and Entity Linking
· [2018 ACL] DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction, [paper], [bibtex].
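Knowledge Graph and Social Network Representation
Covers knowledge graph completion/representation, social network representation, and more.
Knowledge Graph Completion/Representation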
· [2013 NIPS] Reasoning With Neural Tensor Networks for Knowledge Base Completion, [paper], sources: [siddharth-agrawal/Neural-Tensor-Network], [dddoss/tensorflow-socher-ntn].
· [2013 NIPS] TransE: Translating Embeddings for Modeling Multi-relational Data, [paper], sources: [thunlp/TensorFlow-TransX].
· [2014 AAAI] TransH: Knowledge Graph Embedding by Translating on Hyperplanes, [paper], sources: [thunlp/TensorFlow-TransX].
· [2015 EMNLP] PTransE: Modeling Relation Paths for Representation Learning of Knowledge Bases, [paper], [homepage], sources: [thunlp/Fast-TransX].
· [2015 AAAI] TransR: Learning Entity and Relation Embeddings for Knowledge Graph Completion, [paper], sources: [thunlp/TensorFlow-TransX].
· [2015 ACL] TransD: Knowledge Graph Embedding via Dynamic Mapping Matrix, [paper], sources: [thunlp/TensorFlow-TransX].
· [2016 AAAI] Knowledge Graph Completion with Adaptive Sparse Transfer Matrix, [paper], sources: [FrankWork/transparse], [thunlp/Fast-TransX].
· [2016 ACL] Commonsense Knowledge Base Completion, [paper], [homepage], sources: [Lorraine333/ACL_CKBC].
· [2017 AKBC] RelNet: End-to-End Modeling of Entities & Relations, [paper], [homepage].
· [2017 EMNLP] Context-Aware Representations for Knowledge Base Relation Extraction, [paper], sources: [UKPLab/emnlp2017-relation-extraction].
· [2018 AAAI] SenticNet 5: Discovering Conceptual Primitives for Sentiment Analysis by Means of Context Embeddings, [paper].
Graph/Social Network Representation Learning
· [2014 KDD] DeepWalk: Online Learning of Social Representations, [paper], sources: [phanein/deepwalk].
· [2016 NIPS] Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering, [paper], [bibtex], sources: [mdeff/cnn_graph], [xbresson/spectral_graph_convnets].
· [2018 ICLR] Graph Attention Networks, [paper], [bibtex], sources: [PetarV-/GAT], [Diego999/pyGAT], [danielegrattarola/keras-gat].
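Sentiment Analysis and Text Classification
Covers sentiment analysis, stance detection, and text classification.
Text and Paragraph Classification
· [2014 EMNLP] Convolutional Neural Networks for Sentence Classification, [paper], [bibtex], sources: [yoonkim/CNN_sentence], [dennybritz/cnn-text-classification-tf].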
· [2015 ACL] Deep Unordered Composition Rivals Syntactic Methods for Text Classification, [paper], [bibtex], [slides], sources: [miyyer/dan].
· [2015 AAAI] Recurrent Convolutional Neural Networks for Text Classification, [paper], [bibtex], sources: [knok/rcnn-text-classification], [airalcorn2/Recurrent-Convolutional-Neural-Network-Text-Classifier].
· [2016 NAACL] Hierarchical Attention Networks for Document Classification, [paper], [bibtex], sources: [richliao/textClassifier], [ematvey/hierarchical-attention-networks].
· [2017 EACL] Bag of Tricks for Efficient Text Classification, [paper], [bibtex], sources: [facebookresearch/fastText].
· [2017 ArXiv] Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?, [paper], [bibtex], sources: [zhangxiangxiao/glyph].
· [2017 ArXiv] Multi-Task Label Embedding for Text Classification, [paper], [bibtex], [blog].
· [2017 ICLR] Adversarial Training Methods For Semi-Supervised Text Classification, [paper], [bibtex], sources: [TobiasLee/Text-Classification].
· [2017 IJCNLP] PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts, [paper], [bibtex], sources: [Franck-Dernoncourt/pubmed-rct].
· [2017 ACL] Adversarial Multi-task Learning for Text Classification, [paper], [bibtex], sources: [FrankWork/fudan_mtl_reviews].
· [2018 ArXiv] Densely Connected Bidirectional LSTM with Applications to Sentence Classification, [paper], [bibtex], source: [IsaacChanghau/Dense_BiLSTM].
· [2018 NAACL] Multinomial Adversarial Networks for Multi-Domain Text Classification, [paper], [bibtex], sources: [ccsasuke/man].
· [2018 EMNLP] Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts, [paper], [bibtex].
Sentiment Analysis
· Introduction to Sentiment Analysis, [slides], [blog].
· [2013 EMNLP] Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, [paper], sources: [rksltnl/RNTN], [awni/semantic-rntn], [rgobbel/rntn].
· [2014 ACL] A Convolutional Neural Network for Modelling Sentences, [paper], sources: [hritik25/Dynamic-CNN-for-Modelling-Sentences], [FredericGodin/DynamicCNN].
· [2015 EMNLP] Deep Convolutional Neural Network Textual Features and Multiple Kernel Learning for Utterance-Level Multimodal Sentiment Analysis, [paper].
· [2015 ACL] Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, covers semantic relatedness and sentiment classification tasks, [paper], sources: [stanfordnlp/treelstm], [nicolaspi/treelstm], [sapruash/RecursiveNN], [dallascard/TreeLSTM].
· [2016 EMNLP] A Hierarchical Model of Reviews for Aspect-based Sentiment Analysis, [paper].
· [2016 EMNLP] Attention-based LSTM for Aspect-level Sentiment Classification, [paper], sources: [scaufengyang/TD-LSTM].
· [2016 ICDM] Convolutional MKL Based Multimodal Emotion Recognition and Sentiment Analysis, [paper], sources: [SenticNet/multimodal-sentiment-detection].
· [2017 ICME] Select-additive Learning: Improving Generalization in Multimodal Sentiment Analysis, [paper], sources: [HaohanWang/SelectAdditiveLearning].
· [2017 ICMI] Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning, [paper].
· [2017 ACM SIGIR] Multitask Learning for Fine-Grained Twitter Sentiment Analysis, [paper], sources: [balikasg/sigir2017].
· [2017 EMNLP] Tensor Fusion Network for Multimodal Sentiment Analysis, [paper], sources: [A2Zadeh/TensorFusionNetwork].
· [2017 ACL] Context-Dependent Sentiment Analysis in User-Generated Videos, [paper], sources: [SenticNet/contextual-sentiment-analysis].
· [2018 ACL] Multiple Instance Learning Networks for Fine-Grained Sentiment Analysis, [paper], [data].
· [2018 AAAI] Targeted Aspect-Based Sentiment Analysis via Embedding Commonsense Knowledge into an Attentive LSTM, [paper].
· [2018 Cognitive Computation] Sentic LSTM: a Hybrid Network for Targeted Aspect-Based Sentiment Analysis, [paper], sources: [SenticNet/sentic-lstm].
Stance Detection
· [2016 ACM] Stance and Sentiment in Tweets, [paper].
· [2016 SemEval] SemEval-2016 Task 6: Detecting Stance in Tweets, [paper], [homepage], [The SemEval-2016 Stance Dataset].
· [2016 SemEval] DeepStance at SemEval-2016 Task 6: Detecting Stance in Tweets Using Character and Word-Level CNNs, [paper].
· [2016 SEM@ACL] Detecting Stance in Tweets And Analyzing its Interaction with Sentiment, [paper], sources: [vishaalmohan/twitter-stance-detection].
· [2018 ECIR] Topical Stance Detection for Twitter: A Two-Phase LSTM Model Using Attention, [paper].
Character/Word Embeddings and Baseline Systems
Character Embeddings
· [2016 AAAI] Char2Vec: Character-Aware Neural Language Models, [paper], sources: [carpedm20/lstm-char-cnn-tensorflow], [yoonkim/lstm-char-cnn].
Word Embeddings
· [2008 NIPS] HLBL: A Scalable Hierarchical Distributed Language Model, [paper], sources: [wenjieguan/Log-bilinear-language-models].
· [2010 INTERSPEECH] RNNLM: Recurrent neural network based language model, [paper], [Ph.D. Thesis], [slides], sources: [mspandit/rnnlm].
· [2013 NIPS] Word2Vec: Distributed Representations of Words and Phrases and their Compositionality, [paper], [word2vec explained], [params explained], [blog], sources: [word2vec], [dav/word2vec], [yandex/faster-rnnlm], [tf-word2vec], [zake7749/word2vec-tutorial].
· [2013 CoNLL] Better Word Representations with Recursive Neural Networks for Morphology, [paper].
· [2014 ACL] Word2Vecf: Dependency-Based Word Embeddings, [paper], [blog], sources: [Yoav Goldberg/word2vecf], [IsaacChanghau/Word2VecfJava].
· [2014 EMNLP] GloVe: Global Vectors for Word Representation, [paper], [homepage], sources: [stanfordnlp/GloVe].
· [2014 ICML] Compositional Morphology for Word Representations and Language Modelling, [paper], sources: [thompsonb/comp-morph], [claravania/subword-lstm-lm].
· [2015 ACL] Hyperword: Improving Distributional Similarity with Lessons Learned from Word Embeddings, [paper], sources: [Omer Levy/hyperwords].
· [2016 ICLR] Exploring the Limits of Language Modeling, [paper], [slides], sources: [tensorflow/models/lm_1b].
· [2016 CoNLL] Context2Vec: Learning Generic Context Embedding with Bidirectional LSTM, [paper], sources: [orenmel/context2vec].
· [2016 IEEE Intelligent Systems] How to Generate a Good Word Embedding?, [paper], [基于神经网络的词和文档语义向量表示方法研究], [blog], sources: [licstar/compare].
· [2016 ArXiv] Linear Algebraic Structure of Word Senses, with Applications to Polysemy, [paper], [slides], sources: [YingyuLiang/SemanticVector].
· [2017 ACL] FastText: Enriching Word Vectors with Subword Information, [paper], sources: [facebookresearch/fastText], [salestock/fastText.py].
· [2017 ArXiv] Implicitly Incorporating Morphological Information into Word Embedding, [paper].
· [2017 AAAI] Improving Word Embeddings with Convolutional Feature Learning and Subword Information, [paper], sources: [ShelsonCao/IWE].
· [2018 ICML] Learning K-way D-dimensional Discrete Codes for Compact Embedding Representations, [paper], [supplementary], sources: [chentingpc/kdcode-lm].
· [2018 ICLR] Compressing Word Embeddings via Deep Compositional Code Learning, [paper], [bibtex], sources: [msobroza/compositional_code_learning].
Baseline Systems
· [2017 NIPS] Learned in Translation: Contextualized Word Vectors, [paper], sources: [salesforce/cove].
· [2018 NAACL] Deep contextualized word representations, [paper], [homepage], sources: [allenai/bilm-tf].
· [2018 ArXiv] GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations, [paper], [bibtex].
· [2018 ArXiv] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, [paper], [bibtex], sources: [google-research/bert], [huggingface/pytorch-pretrained-BERT].
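Sentence Representation, Natural Language Inference, and Summarization
Covers sentence embeddings/representations, natural language inference, sentence matching, textual entailment, text summarization, and more.
Sentence Embeddings / Representations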
· [2015 NIPS] Skip Thought Vectors, [paper], [bibtex], sources: [ryankiros/skip-thoughts].
· [2017 ICLR] A Simple But Tough-to-beat Baseline for Sentence Embeddings, [paper], [bibtex], sources: [PrincetonML/SIF].
· [2017 ICLR] A Structured Self-attentive Sentence Embedding, [paper], [bibtex], sources: [ExplorerFreda/Structured-Self-Attentive-Sentence-Embedding], [flrngel/Self-Attentive-tensorflow], [kaushalshetty/Structured-Self-Attention].
· [2017 EMNLP] Supervised Learning of Universal Sentence Representations from Natural Language Inference Data, [paper], [bibtex], sources: [facebookresearch/InferSent].
· [2018 ICLR] Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning, [paper], [bibtex], sources: [Maluuba/gensen].
· [2018 ArXiv] Universal Sentence Encoder, [paper], [bibtex], sources: [TensorFlow Hub/universal-sentence-encoder], [helloeve/universal-sentence-encoder-fine-tune].
· [2018 ArXiv] Evaluation of Sentence Embeddings in Downstream and Linguistic Probing Tasks, [paper], [bibtex].
· [2018 EMNLP] XNLI: Evaluating Cross-lingual Sentence Representations, [paper], [bibtex], sources: [facebookresearch/XNLI].
· [2018 EMNLP] Dynamic Meta-Embeddings for Improved Sentence Representations, [paper], [bibtex], sources: [facebookresearch/DME].
Natural Language Inference (Textual Entailment, Sentence Matching)
· [2016 NAACL] Learning Natural Language Inference with LSTM, [paper], [bibtex], source: [shuohangwang/SeqMatchSeq].
· [2017 IJCAI] BiMPM: Bilateral Multi-Perspective Matching for Natural Language Sentences, [paper], [bibtex], sources: [zhiguowang/BiMPM].
· [2017 ArXiv] Distance-based Self-Attention Network for Natural Language Inference, [paper], [bibtex].
· [2018 AAAI] DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding, [paper], [bibtex], sources: [taoshen58/DiSAN].
· [2018 IJCAI] Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling, [paper], [bibtex].
Text Summarization
· [2017 ACL] Get To The Point: Summarization with Pointer-Generator Networks, [paper], [bibtex], sources: [abisee/pointer-generator], [abisee/cnn-dailymail], [JafferWilson/Process-Data-of-CNN-DailyMail].
· [2018 ICLR] Generating Wikipedia by Summarizing Long Sequences, [paper], [bibtex], sources: [tensorflow/tensor2tensor · wikisum].
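Interpretability, Disambiguation, Anaphora, and Discourse
Covers interpretability, disambiguation, anaphora, discourse relation representation, and more.
Interpretability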
· [2012 COLING] Learning Effective and Interpretable Semantic Models using Non-Negative Sparse Embedding, [paper], [homepage].
· [2015 NAACL] A Compositional and Interpretable Semantic Space, [paper].
· [2015 EMNLP] Online Learning of Interpretable Word Embeddings, [paper].
· [2015 ACL] SPOWV: Sparse Overcomplete Word Vector Representations, [paper], sources: [mfaruqui/sparse-coding].
· [2016 IJCAI] Sparse Word Embeddings Using l1 Regularized Online Learning, [paper], [slides].
· [2017 ArXiv] Semantic Structure and Interpretability of Word Embeddings, [paper].
· [2017 EMNLP] Rotated Word Vector Representations and their Interpretability, [paper], [poster], sources: [SungjoonPark/factor_rotation], [mvds314/factor_rotation].
· [2018 AAAI] SPINE: SParse Interpretable Neural Embeddings, [paper], sources: [harsh19/SPINE].
· [2018 ACL] Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder, [paper], [bibtex], sources: [tianran/glimvec].
Disambiguation
· [2015 VSM] A Simple Word Embedding Model for Lexical Substitution, [paper], sources: [orenmel/lexsub].
· [2017 EMNLP] Deep Joint Entity Disambiguation with Local Neural Attention, [paper], sources: [dalab/deep-ed].
Coreference and Anaphora Resolution
· [2012 EMNLP] Joint Entity and Event Coreference Resolution across Documents, [paper], [bibtex].
· [2016 EMNLP] Deep Reinforcement Learning for Mention-Ranking Coreference Models, [paper], [bibtex], [blog], [demo], sources: [huggingface/neuralcoref], [clarkkev/deep-coref].
· [2016 ACL] Improving Coreference Resolution by Learning Entity-Level Distributed Representations, [paper], [bibtex], sources: [clarkkev/deep-coref].
· [2017 ArXiv] Linguistic Knowledge as Memory for Recurrent Neural Networks, [paper], [bibtex].
Discourse Relation Representation and Identification
· [2017 EMNLP] Multi-task Attention-based Neural Networks for Implicit Discourse Relationship Representation and Identification, [paper], [bibtex].
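Multi-task Learning and Other Uncategorized Work
Covers multi-task learning, NLP surveys, NLP optimization methods, grammatical error correction, and related topics.
Multi-task Learning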
· [2011 JMLR] Natural Language Processing (Almost) from Scratch, covers tagging, chunking, parsing, NER, SRL, and other tasks, [paper], [bibtex], sources: [attardi/deepnl].
· [2017 EMNLP] A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks, covers tagging, chunking, parsing, relatedness, and entailment tasks, [paper], [bibtex], [blog], sources: [rubythonode/joint-many-task-model].
· [2018 ICLR] Bi-Directional Block Self-Attention for Fast and Memory-Efficient Sequence Modeling, [paper], [bibtex], sources: [taoshen58/BiBloSA].
· [2018 CoNLL] Sequence Classification with Human Attention, [paper], [bibtex], sources: [coastalcph/Sequence_classification_with_human_attention].
· [2018 ArXiv] Improving Language Understanding by Generative Pre-Training, [paper], [bibtex], [homepage], sources: [openai/finetune-transformer-lm].
NLP Surveys
· [2018 JAIR] Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation, [paper].
· [2018 CIM] Recent Trends in Deep Learning Based Natural Language Processing, [paper].
Grammatical Error Correction
· [2014 CoNLL] The CoNLL-2014 Shared Task on Grammatical Error Correction, [paper], [bibtex], [homepage].
· [2018 AAAI] A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction, [paper], [bibtex], sources: [nusnlp/mlconvgec2018].
Others
· [2016 EMNLP] How Transferable are Neural Networks in NLP Applications?, [paper].
· [2017 ICML] Language Modeling with Gated Convolutional Networks, [paper], sources: [anantzoid/Language-Modeling-GatedCNN], [jojonki/Gated-Convolutional-Networks].
· [2017 CIKM] Commonsense for Machine Intelligence: Text to Knowledge and Knowledge to Text, [slides], [CIKM 2017 Singapore Tutorials], [Commonsense for Machine Intelligence, Allen Institute, CIKM 2017 TUTORIAL], [Allen Institute].
· [2017 ICLR] An Actor Critic Algorithm for Structured Prediction, [paper], [bibtex], sources: [rizar/actor-critic-public].
· [2017 ACL] Learning When to Skim and When to Read, [paper], [blog].
· [2018 ArXiv] Fast Directional Self-Attention Mechanism, [paper], [bibtex], sources: [taoshen58/Fast-DiSA].
· [2018 ICLR] Regularizing and Optimizing LSTM Language Models, [paper], [bibtex], sources: [salesforce/awd-lstm-lm], author page: [Nitish Shirish Keskar].