Recommended | A Tsinghua Professor's List of 30-Odd Open-Source Algorithm Implementations and Toolkits

March 26, 2018 · Global AI (全球人工智能)

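Professor Liu Zhiyuan of Tsinghua University spent half a day compiling a list of the thirty-odd algorithm implementations and toolkits that he and his students have open-sourced over the past few years. The list covers knowledge representation, relation extraction, sememe computation, semantic representation, network representation, text processing, and other tasks, and almost everything is hosted on GitHub.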

Highlight Packages

  • THULAC: An Efficient Lexical Analyzer for Chinese.

    [home]http://thulac.thunlp.org/

    [Git C++]https://github.com/thunlp/thulac

    [Git Java]https://github.com/thunlp/THULAC-Java

    [Git Python]https://github.com/thunlp/THULAC-Python

  • THUCTC: An Efficient Chinese Text Classifier.

    [home]http://thuctc.thunlp.org/

    [Git Java]https://github.com/thunlp/THUCTC

  • THUOCL: Open Chinese Lexicon.

    [home]http://thuocl.thunlp.org/

  • OpenKE: An Open-Source Package for Knowledge Embedding (KE).

    [home]http://openke.thunlp.org/

    [Git] https://github.com/thunlp/OpenKE

  • OpenNE: An Open-Source Package for Network Embedding (NE).

    [Git] https://github.com/thunlp/OpenNE

Knowledge Graph and Relation Extraction

  • NRE: An Open-Source Package for Neural Relation Extraction.

    [Git]https://github.com/thunlp/NRE

    [TensorFlow Version] https://github.com/thunlp/TensorFlow-NRE
    Neural relation extraction aims to extract relations from plain text with neural models, which have become the state-of-the-art approach to relation extraction. In this package, we provide our implementations of CNN [Zeng et al., 2014] and PCNN [Zeng et al., 2015], along with their extended versions that use a sentence-level attention scheme [Lin et al., 2016].

  • JointNRE: Joint Neural Relation Extraction with Text and KGs.

    [Git] https://github.com/thunlp/JointNRE
    This is the lab code of our AAAI 2018 paper "Neural Knowledge Acquisition via Mutual Attention between Knowledge Graph and Text".

  • PathNRE: Neural Relation Extraction with Relation Paths.

    [Git] https://github.com/thunlp/PathNRE
    This is the lab code of our EMNLP 2017 paper "Incorporating Relation Paths in Neural Relation Extraction".

  • Neural Entity Alignment.

    [Git] https://github.com/thunlp/IEAJKE
    This is the lab code of our IJCAI 2017 paper "Iterative Entity Alignment via Joint Knowledge Embeddings".

  • Neural Entity Typing.

    [Git] https://github.com/thunlp/KNET
    This is the lab code of our AAAI 2018 paper "Improving Neural Fine-Grained Entity Typing with Knowledge Attention".
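
The NRE entry in this section mentions a sentence-level attention scheme [Lin et al., 2016]. As a rough illustration only, the sketch below weights the sentence vectors in a bag by a softmax over their scores against a relation query vector and returns the weighted sum. The function names, the toy vectors, and the plain dot-product scoring are assumptions made for this example (a learned scoring function would normally take its place), and nothing here is taken from the thunlp/NRE code.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_over_sentences(sentence_vecs, relation_query):
    """Return (weights, bag_vector) for a bag of sentence vectors.

    Each sentence is scored by its dot product with the relation query
    (a simplification assumed for this sketch), the scores are
    normalised with softmax, and the bag vector is the weighted sum.
    """
    scores = [sum(x * q for x, q in zip(vec, relation_query))
              for vec in sentence_vecs]
    weights = softmax(scores)
    dim = len(relation_query)
    bag = [sum(w * vec[d] for w, vec in zip(weights, sentence_vecs))
           for d in range(dim)]
    return weights, bag

# Toy bag: three 2-d sentence vectors that mention the same entity pair.
bag_of_sentences = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
query = [1.0, 0.0]  # hypothetical relation query vector
weights, bag_vector = attend_over_sentences(bag_of_sentences, query)
```

Sentences that align better with the query receive larger weights, so noisy sentences in the bag contribute less to the bag vector.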

Knowledge Representation Learning

  • OpenKE: An Open-Source Package for Knowledge Embedding (KE).

    [Git] https://github.com/thunlp/OpenKE

  • KRLPapers: Must-read papers on knowledge representation learning (KRL) / knowledge embedding (KE).

    [Git] https://github.com/thunlp/KRLPapers

  • TransX: An efficient implementation of TransE and its extended models for Knowledge Representation Learning.

    [Git]https://github.com/thunlp/Fast-TransX

    [TensorFlow Version] https://github.com/thunlp/TensorFlow-TransX

  • KB2E: A package of Knowledge Base to Embeddings.

    [Git] https://github.com/thunlp/KB2E
    The package contains state-of-the-art knowledge representation learning methods including TransE, TransH, TransR and PTransE.

  • KR-EAR: Knowledge Representation Learning with Entities, Attributes and Relations. [Git] 
    This is the lab code of our IJCAI 2016 paper "Knowledge Representation Learning with Entities, Attributes and Relations".

  • CKRL: Confidence-aware Knowledge Representation Learning.

    [Git] https://github.com/thunlp/CKRL
    This is the lab code of our AAAI 2018 paper "Does William Shakespeare REALLY Write Hamlet? Knowledge Representation Learning with Confidence". The method is expected to support robust knowledge representation learning with noisy triples.

  • IKRL: Image-embodied Knowledge Representation Learning.

    [Git] https://github.com/thunlp/IKRL
    This is the lab code of our IJCAI 2017 paper "Image-embodied Knowledge Representation Learning". The method is expected to support knowledge representation learning with entity images.

  • TKRL: Type-embodied Knowledge Representation Learning.

    [Git] https://github.com/thunlp/TKRL
    This is the lab code of our IJCAI 2016 paper "Representation Learning of Knowledge Graphs with Hierarchical Types". The method is expected to support knowledge representation learning with hierarchical types of entities.

  • DKRL: Description-embodied Knowledge Representation Learning.

    [Git] https://github.com/thunlp/DKRL
    This is the lab code of our AAAI 2016 paper "Representation Learning of Knowledge Graphs with Entity Descriptions". The method is expected to support knowledge representation learning with entity descriptions.
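
Several packages in this section (KB2E, Fast-TransX) implement TransE, which models a relation as a translation in the embedding space, so that head + relation ≈ tail for a true triple. A minimal sketch of the scoring function and the margin-based ranking loss follows. The embeddings, the margin value, and the function names are made up for illustration and are not the API of any package listed here.

```python
def transe_score(head, relation, tail, norm=1):
    """Dissimilarity ||h + r - t||; a lower score means a more plausible triple."""
    diffs = [h + r - t for h, r, t in zip(head, relation, tail)]
    if norm == 1:
        return sum(abs(d) for d in diffs)
    return sum(d * d for d in diffs) ** 0.5

def margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss that pushes true triples at least `margin` below corrupted ones."""
    return max(0.0, margin + pos_score - neg_score)

# Hypothetical 2-d embeddings, chosen by hand for the example.
entity = {"Paris": [0.9, 0.1], "France": [1.0, 1.0], "Tokyo": [-0.8, 0.2]}
relation = {"capital_of": [0.1, 0.9]}

# True triple versus a corrupted triple with the head replaced.
pos = transe_score(entity["Paris"], relation["capital_of"], entity["France"])
neg = transe_score(entity["Tokyo"], relation["capital_of"], entity["France"])
loss = margin_loss(pos, neg)
```

During training, the loss is summed over many true/corrupted pairs and minimised with gradient descent; this sketch only evaluates it for one pair.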

Network Representation Learning

  • OpenNE: An Open-Source Package for Network Embedding (NE).

    [Git] https://github.com/thunlp/OpenNE

  • NRLPapers: Must-read papers on network representation learning (NRL) / network embedding (NE).

    [Git] https://github.com/thunlp/NRLPapers

  • TransNet: Translation-Based Network Representation Learning.

    [Git] https://github.com/thunlp/TransNet
    This is the lab code of our IJCAI 2017 paper "TransNet: Translation-Based Network Representation Learning for Social Relation Extraction". The method is expected to model social networks by regarding relations as the translation between vertices.

  • NEU: Fast Network Embedding.

    [Git] https://github.com/thunlp/NEU
    This is the lab code of our IJCAI 2017 paper "Fast Network Embedding Enhancement via High Order Proximity Approximation". The method is expected to speed up network embedding with an approximate update algorithm.

  • CANE: Context-Aware Network Embedding.

    [Git] https://github.com/thunlp/CANE
    This is the lab code of our ACL 2017 paper "CANE: Context-Aware Network Embedding for Relation Modeling". The method is expected to support context-aware network representation learning and model asymmetric relations.

  • MMDW: Max-Margin DeepWalk.

    [Git] https://github.com/thunlp/MMDW
    This is the lab code of our IJCAI 2016 paper "Max-Margin DeepWalk: Discriminative Learning of Network Representation". The method is expected to support discriminative network representation learning with node labels.

  • TADW: Text-Associated DeepWalk.

    [Git] https://github.com/thunlp/TADW
    This is the lab code of our IJCAI 2015 paper "Network Representation Learning with Rich Text Information". The method is expected to support network representation learning with rich text information within each node. The code requires a 64-bit Linux machine with MATLAB installed.
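
MMDW and TADW both extend DeepWalk, which samples truncated random walks from the graph and trains a skip-gram model on them as if they were sentences. The sketch below covers only the walk-sampling stage on a toy adjacency list. The graph, the function name, and the parameter values are assumptions for illustration and are not taken from the repositories above.

```python
import random

def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Sample truncated random walks starting from every node.

    `adjacency` maps each node to a list of its neighbours.
    """
    rng = random.Random(seed)
    walks = []
    nodes = list(adjacency)
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks

# Toy undirected graph stored as an adjacency list.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = random_walks(toy_graph, walk_length=5, walks_per_node=2)
```

Each walk would then be treated as a sentence of node identifiers when learning the embeddings.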

Sememe-Driven NLP

  • SE-WRL: Improved Word Representation Learning with Sememes.

    [Git] https://github.com/thunlp/SE-WRL
    This is the lab code of our ACL 2017 paper "Improved Word Representation Learning with Sememes". Sememes are the minimum semantic units of word meanings, and the meaning of each word sense is typically composed of several sememes. We propose an improved word representation learning method that uses the sememe knowledge annotated in HowNet.

  • Lexical Sememe Prediction.

    [Git] https://github.com/thunlp/sememe_prediction
    This is the lab code of our IJCAI 2017 paper "Lexical Sememe Prediction via Word Embeddings and Matrix Factorization".

  • Chinese LIWC Lexicon Expansion.

    [Git] https://github.com/thunlp/Auto_CLIWC
    This is the lab code of our AAAI 2018 paper "Chinese LIWC Lexicon Expansion via Hierarchical Classification of Word Embeddings with Sememe Attention".

Language Representation Learning

  • CWE: Character Word Embeddings.

    [Git] https://github.com/Leonard-Xu/CWE
    This is the lab code of our IJCAI 2015 paper "Joint Learning of Character and Word Embeddings". This method is expected to learn Chinese word embeddings by taking the characters within words into consideration. The Chinese analogical reasoning dataset is available in the data folder.

  • CLWE: Cross-Lingual Word Embeddings.

    [home] http://nlp.csai.tsinghua.edu.cn/~lzy/src/acl2015_bilingual.html
    This is the lab code of our ACL 2015 short paper "Learning Cross-lingual Word Embeddings via Matrix Co-factorization". This method is expected to learn cross-lingual word embeddings with a matrix co-factorization framework.

  • OIWE: Online Interpretable Word Embeddings.

    [Git] https://github.com/SkTim/OIWE
    This is the lab code of our EMNLP 2015 short paper "Online Learning of Interpretable Word Embeddings". This method is expected to learn interpretable word embeddings based on the OIWE-IPG model proposed in our paper.

  • TWE: Topical Word Embeddings.

    [Git] https://github.com/thunlp/topical_word_embeddings
    This is the lab code of our AAAI 2015 paper "Topical Word Embeddings". The method is expected to perform representation learning of words with their topic assignments by latent topic models such as Latent Dirichlet Allocation.

General NLP

  • THUCKE: An Open-Source Package for Chinese Keyphrase Extraction.

    [Git]https://github.com/thunlp/THUCKE
    The package can efficiently extract Chinese keyphrases by translating from documents to keyphrases, using the word alignment models (WAM) that we proposed in [EMNLP][CoNLL].

  • TensorFlow-Summarization: An Open-Source Package for Neural Headline Generation.

    [Git] https://github.com/thunlp/TensorFlow-Summarization
    This is an implementation of a sequence-to-sequence model with a bidirectional GRU encoder and a GRU decoder. The project aims to help people start working on abstractive short-text summarization right away, and it may also work for machine translation tasks.

  • THUNSC: An Open-Source Package for Neural Sentiment Classification.

    [Git]https://github.com/thunlp/NSC
    Neural Sentiment Classification aims to classify the sentiment of a document with neural models, which have become the state-of-the-art approach to sentiment classification. In this package, we provide our implementations of NSC, NSC+LA and NSC+UPA [Chen et al., 2016], in which user and product information is considered via attention over different semantic levels.

  • THUTAG: An Open-Source Package for Keyphrase Extraction and Social Tag Suggestion.

    [Git] https://github.com/thunlp/THUTag
    The package contains several keyphrase extraction methods, including TextRank, ExpandRank, Topical PageRank and WAM, and social tag suggestion methods, including KNN, PMI, TagLDA, TAM and WTM. The package has powered one of the most popular microblog apps, Weibo Keywords, which has more than 3.5 million registered users.

  • PLDA+: An Open-Source Package for Parallel LDA.

    [Git] https://code.google.com/archive/p/plda/
    PLDA is a parallel C++ implementation of Latent Dirichlet Allocation (LDA). We present a highly optimized parallel implementation of the Gibbs sampling algorithm for the training and inference of LDA. The carefully designed architecture is expected to support extensions of this algorithm. PLDA+, an enhanced parallel implementation of LDA, further improves the scalability of LDA by significantly reducing the unparallelizable communication bottleneck and achieving good load balancing.
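
PLDA parallelises Gibbs sampling for LDA. For orientation, here is a single-machine sketch of a standard collapsed Gibbs sampler on a toy corpus. It is not PLDA's code, and the corpus, the hyperparameters `alpha` and `beta`, and the function name are assumptions chosen for illustration.

```python
import random

def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iterations=50, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]             # n_dt
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n_tw
    topic_total = [0] * num_topics                           # n_t
    assignments = []
    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta).
                probs = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=probs)[0]
                # Record the newly sampled topic.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word

# Toy corpus over a 4-word vocabulary (word ids 0..3).
toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0], [3, 2, 3, 3]]
doc_topic, topic_word = lda_gibbs(toy_docs, num_topics=2, vocab_size=4)
```

The inner loop touches shared count tables for every token, which is the part a parallel implementation has to partition or synchronise across workers.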

Original: http://nlp.csai.tsinghua.edu.cn/~lzy/codes.html
