Facebook Open-Sources PyText on GitHub

December 14, 2018, AINLP

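The project we teased a few days ago is now public: a deep learning NLP framework built on PyTorch. The GitHub address is below (click "Read the original" to go straight there):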

https://github.com/facebookresearch/pytext

A natural language modeling framework based on PyTorch

PyText is a deep-learning-based NLP modeling framework built on PyTorch. PyText addresses the often-conflicting requirements of enabling rapid experimentation and of serving models at scale. It achieves this by providing simple and extensible interfaces and abstractions for model components, and by using PyTorch’s ability to export models for inference via the optimized Caffe2 execution engine. We are using PyText at Facebook to iterate quickly on new modeling ideas and then seamlessly ship them at scale.

Core PyText features:

Production-ready models for various NLP/NLU tasks:

Text classifiers
    Yoon Kim (2014): Convolutional Neural Networks for Sentence Classification
    Lin et al. (2017): A Structured Self-attentive Sentence Embedding
Sequence taggers
    Lample et al. (2016): Neural Architectures for Named Entity Recognition
Joint intent-slot model
    Zhang et al. (2016): A Joint Model of Intent Determination and Slot Filling for Spoken Language Understanding
Contextual intent-slot models

Distributed-training support built on the new C10d backend in PyTorch 1.0
Extensible components that allow easy creation of new models and tasks
Reference implementation and a pretrained model for the paper: Gupta et al. (2018): Semantic Parsing for Task Oriented Dialog using Hierarchical Representations
Ensemble training support

Installing PyText

To get started on a Cloud VM, check out our guide.

We recommend using a virtualenv:

  $ python3 -m virtualenv venv
  $ source venv/bin/activate
  (venv) $ pip install pytext-nlp

Detailed instructions can be found in our documentation.

Train your first text classifier

For this first example, we'll train a CNN-based text classifier that classifies text utterances, using the examples in tests/data/train_data_tiny.tsv.

  (venv) $ pytext train < demo/configs/docnn.json

By default, the model is created in /tmp/model.pt.

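Now you can export your model as a Caffe2 net (config.json here stands for your own config file, presumably the same one used for training, such as demo/configs/docnn.json):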

  (venv) $ pytext export < config.json

You can use the exported Caffe2 model to predict the class of raw utterances like this:

  (venv) $ pytext --config-file config.json predict <<< '{"raw_text": "create an alarm for 1:30 pm"}'
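The predict command above reads a one-line JSON object with a raw_text key from standard input. When the utterance contains quotes or apostrophes, hand-writing that JSON inside a shell string is easy to get wrong. The following is a minimal sketch, not part of PyText, that builds the payload with json.dumps and quotes it for the shell with shlex.quote. The helper names build_predict_request and build_predict_command are hypothetical and introduced only for illustration; the command line itself is the one shown above, and the sketch only prints it rather than running it.

```python
import json
import shlex


def build_predict_request(raw_text: str) -> str:
    """Return the one-line JSON payload with the `raw_text` key shown above."""
    # ensure_ascii=False keeps non-ASCII utterances readable in the shell.
    return json.dumps({"raw_text": raw_text}, ensure_ascii=False)


def build_predict_command(config_path: str, raw_text: str) -> str:
    """Return a shell command line that passes the payload as a here-string.

    Hypothetical helper: it only assembles the string and does not run pytext.
    """
    payload = build_predict_request(raw_text)
    # shlex.quote escapes embedded quotes so the shell passes the JSON intact.
    return "pytext --config-file {} predict <<< {}".format(
        shlex.quote(config_path), shlex.quote(payload)
    )


if __name__ == "__main__":
    print(build_predict_command("config.json", "create an alarm for 1:30 pm"))
    print(build_predict_command("config.json", "what's the weather today"))
```

For the first utterance this reproduces the command shown above; for the second, the apostrophe is escaped so that the shell still delivers valid JSON to pytext.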

License

PyText is BSD-licensed, as found in the LICENSE file.
