Recommended GitHub project: hscspring/All4NLP
All For NLP, especially Chinese
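The author is 太子長琴, a member of the AINLP discussion group, who has collected the NLP resources they have read and used and sorted them into fine-grained categories. It is worth a star. Project link (click "Read the original" to go straight there):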
https://github.com/hscspring/All4NLP
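The following comes from the project's home page; click "Read the original" to reach the linked resources.
The year in front of each link is when it was last updated.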
facebookresearch/pytext: A natural language modeling framework based on PyTorch
deep learning NLP with PyTorch
Text classifiers, Sequence taggers, Joint intent-slot model and Contextual intent-slot models
C++ server example
zalandoresearch/flair: A very simple framework for state-of-the-art Natural Language Processing (NLP)
NER, POS, sense disambiguation and classification
on top of PyTorch
stanfordnlp/stanfordnlp: Official Stanford NLP Python Library for Many Human Languages
Java library with Python wrappers
speed, production system use
nltk/nltk: NLTK Source
education and research tool
learning and exploring NLP concepts
sloria/TextBlob: Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
on top of NLTK
fast prototyping
applications that don't require high performance
spaCy · Industrial-strength Natural Language Processing in Python
fast
streamlined
production-ready
chartbeat-labs/textacy: NLP, before and after spaCy
rockingdingo/deepnlp: Deep Learning NLP Pipeline implemented on Tensorflow
deep learning NLP with TensorFlow
2018 BenchMark
geek-ai/Texygen: A text generation benchmarking platform
2018 RNN
docs/text_generation.ipynb at master · tensorflow/docs
2019 Toolkit on top of TF
asyml/texar: Toolkit for Text Generation and Beyond
Collection
brightmart/text_classification: all kinds of text classification models and more with deep learning
2019 Framework
RasaHQ/rasa_nlu:
2018 Chi
crownpku/Rasa_NLU_Chi: Turn Chinese natural language into structured data (Chinese natural language understanding)
2019 Toolkit
snipsco/snips-nlu: Snips Python library to extract meaning from text
2018
5hirish/adam_qas: ADAM - A Question Answering System. Inspired from IBM Watson
2019 Doc+Sentence+Word
gensim: Topic modelling for humans
2019 MinHash
ekzhu/datasketch: MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++
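For readers unfamiliar with what MinHash estimates, here is a minimal standard-library sketch of the idea. It is not datasketch's implementation or API; the function names and the choice of salted BLAKE2b as the hash family are assumptions made purely for illustration.

```python
import hashlib


def minhash_signature(tokens, num_perm=128):
    """Build a MinHash signature for a non-empty collection of string tokens.

    Each of the num_perm salted hash functions stands in for one random
    permutation; the signature keeps the minimum hash value under each one.
    (Hypothetical helper, not part of datasketch.)
    """
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "big")  # a different salt gives a different hash function
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(tok.encode("utf-8"), digest_size=8, salt=salt).digest(),
                    "big",
                )
                for tok in tokens
            )
        )
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; this estimates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


doc_a = {"natural", "language", "processing", "with", "python"}
doc_b = {"natural", "language", "understanding", "with", "python"}
estimate = estimate_jaccard(minhash_signature(doc_a), minhash_signature(doc_b))
exact = exact_jaccard(doc_a, doc_b)
```

The estimate gets closer to the exact value as `num_perm` grows, at the cost of a longer signature.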
2019 LevenshteinDistance
ztane/python-Levenshtein: The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
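As a reference for what the C extension above computes quickly, the following is a plain-Python sketch of Levenshtein distance using the classic two-row dynamic program. The function names are hypothetical, and the normalized similarity shown here is one simple convention chosen for illustration; it is not necessarily the formula the library's own ratio function uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            replace_cost = previous[j - 1] + (ch_a != ch_b)
            current.append(min(insert_cost, delete_cost, replace_cost))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """1.0 for identical strings, 0.0 when every position must change (assumed convention)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


distance = levenshtein("kitten", "sitting")  # 3
```

The same function works on Chinese text because Python strings are sequences of Unicode code points, so each character counts as one edit unit.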
2018 Graph
caesar0301/graphsim: Graph similarity algorithms based on NetworkX.
2019 Pinyin
mozillazg/python-pinyin: Chinese characters to pinyin (pypinyin)
2019 Word
JasonKessler/scattertext: Beautiful visualizations of how language differs among document types.
2019 Bert GPT
jessevig/bertviz: Tool for visualizing attention in the Transformer model (BERT and OpenAI GPT-2)
2019 Kinds of indexes
shivam5992/textstat: python package to calculate readability statistics of a text object - paragraphs, sentences, articles.
2019 in spaCy
mholtzscher/spacy_readability: spaCy pipeline component for adding text readability meta data to Doc objects.
2018 Microsoft Based on Phrase
Microsoft/NPMT: Towards Neural Phrase-based Machine Translation
2019 Google Based on Seq2Seq and Attention
tensorflow/nmt: TensorFlow Neural Machine Translation Tutorial
2019 Google Based on Pure Attention
models/official/transformer at master · tensorflow/models
2019 Facebook Based on CNN
pytorch/fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
2019 Facebook Based on Unsupervised
facebookresearch/UnsupervisedMT: Phrase-Based & Neural Unsupervised Machine Translation
2019 DeepL Based on CNN (Not Open Source)
DeepL Translator: DeepL's CNN-based translation tool
2019 OpenNMT
OpenNMT/OpenNMT: Open Source Neural Machine Translation
2019 Word
google-research/bert: TensorFlow code and pre-trained models for BERT
2019 Sentence
hanxiao/bert-as-service: Mapping a variable-length sentence to a fixed-length vector using BERT model
2018 Sentence
explosion/sense2vec:
2019 Sentence
gensim: models.doc2vec – Doc2vec paragraph embeddings
2019 Word
Embedding/Chinese-Word-Vectors: 100+ Chinese Word Vectors (over a hundred sets of pre-trained Chinese word vectors)
2014 Sentence
klb3713/sentence2vec: Tools for mapping a sentence with arbitrary length to vector space
Question
How do you compute the similarity between two sentences with word2vec? - Zhihu
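One common baseline for the question above is to average the word vectors of each sentence and compare the averages with cosine similarity. The sketch below illustrates only that baseline and may differ from what the linked discussion recommends. The tiny three-dimensional `TOY_VECTORS` table is made up for the example; in practice the vectors would come from a trained word2vec model.

```python
import math

# Made-up 3-dimensional vectors standing in for real word2vec embeddings.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "sat": [0.1, 0.9, 0.1],
    "ran": [0.2, 0.8, 0.1],
    "stock": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.3, 0.8],
}


def sentence_vector(tokens, vectors):
    """Average the vectors of the tokens that have one; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("none of the tokens has a vector")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def sentence_similarity(tokens_a, tokens_b, vectors=TOY_VECTORS):
    """Cosine similarity between the averaged word vectors of two token lists."""
    return cosine(sentence_vector(tokens_a, vectors), sentence_vector(tokens_b, vectors))


related = sentence_similarity(["cat", "sat"], ["dog", "ran"])
unrelated = sentence_similarity(["cat", "sat"], ["stock", "fell"])
```

Averaging ignores word order, so this baseline cannot tell "dog bites man" from "man bites dog"; the sentence-level entries in this list, such as doc2vec and bert-as-service, are alternatives to compare against.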
2018 LSTM
Recurrent Neural Networks | TensorFlow
2019 Translation, Summarization, LM, TextGeneration
pytorch/fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
2019 Seq2Seq, SeqTagging, SeqClassification, LM
OpenNMT/OpenNMT: Open Source Neural Machine Translation
2019 QA, LM, Sentiment, SpeechRecognition, Summarization, MT
tensorflow/tensor2tensor: Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Michael Collins, Michael Collins - Google Scholar Citations ☆
Terry Koo, Terry Koo - Google Scholar Citations
Percy Liang, Percy Liang - Google Scholar Citations
Luke Zettlemoyer | Computer Science & Engineering, Luke Zettlemoyer - Google Scholar Citations
Jason Eisner - Home Page (JHU), Jason Eisner - Google Scholar Citations ☆
Noah Smith, Noah A. Smith - Google Scholar Citations
David Yarowsky, David Yarowsky - Google Scholar Citations
Dan Jurafsky - Home Page, Dan Jurafsky - Google Scholar Citations ☆
Christopher Manning, Stanford NLP, Christopher D Manning - Google Scholar Citations ☆
Richard Socher - Home Page, Richard Socher - Google Scholar Citations ☆
Dan Klein's Home Page, The Berkeley NLP Group ☆
Dan Roth - Main Page, Dan Roth - Google Scholar Citations ☆
ChengXiang Zhai - Home Page, ChengXiang Zhai - Google Scholar Citations
Eugene Charniak's Home Page, Eugene Charniak - Google Scholar Citations
Joakim Nivre's Home Page, Joakim Nivre - Google Scholar Citations ☆
Philipp Koehn, Philipp Koehn - Google Scholar Citations
James H. Martin, James H. Martin - Google Scholar Citations
Julia Hirschberg, Julia Hirschberg - Google Scholar Citations
Fernando Pereira – Google AI, Fernando Pereira - Google Scholar Citations ☆
Ryan McDonald, Ryan McDonald - Google Scholar Citations
Slav Petrov - Слав Петров, Slav Petrov - Google Scholar Citations ☆
Kenneth Church HomePage, Kenneth Ward Church - Google Scholar Citations
Who are the legendary figures in NLP (natural language processing)? - Zhihu