The Simplest Way to Call a BERT Model

December 23, 2019 · 深度学习自然语言处理 (Deep Learning and Natural Language Processing)

Source: AINLP

Author: Duan Qinghua (段清华), Technical Director at 金证优智

Original article:

https://zhuanlan.zhihu.com/p/97926790

Project repository:

https://github.com/qhduan/bert-model

BERT Models

Note: reproducing the results in this article requires TensorFlow 2.0.

This may well be the simplest pre-loaded BERT model.

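Of course, the implementation is somewhat tricky, and the tokenizer is not the real BERT tokenizer. For Chinese this mostly causes no serious problems. For English it will not work properly, because the real BERT tokenizer relies on subword tokenization (WordPiece, a BPE-like scheme).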

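The point of this project is that a handful of very simple lines of code is enough to build a model that comes close to state-of-the-art (SOTA) results.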

BERT classification model (pool mode)

Returns a 1x768 vector, which serves as a fixed-length embedding of the sentence.

Based on an actual test example from Chinese GLUE:

import tensorflow_hub as hub

# Note: the URL ends with pool.tar.gz
model = hub.KerasLayer('https://code.aliyun.com/qhduan/chinese_roberta_wwm_ext_L-12_H-768_A-12/raw/master/pool.tar.gz')

# y.shape == (1, 768)
y = model([['我爱你']])
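
Because the pool model returns one fixed-length vector per sentence, a common use is to compare sentences by cosine similarity. The following is a minimal sketch under that assumption. The function name `cosine_similarity_matrix` is hypothetical, and random vectors stand in for real model outputs so that the snippet runs without downloading anything.

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity for an (n, d) array of sentence vectors."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for an all-zero vector
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T

# Stand-in data (assumption): random vectors with the same width (768)
# that the pool model returns; replace with real embeddings in practice.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(3, 768))

sim = cosine_similarity_matrix(fake_embeddings)
print(sim.shape)
```

With real embeddings, stacking the outputs for several sentences into one `(n, 768)` array and passing it to this function would give an `(n, n)` similarity matrix.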

A very simple example (classifier.py):

import os
# Show a progress bar while TensorFlow Hub downloads the model
os.environ['TFHUB_DOWNLOAD_PROGRESS'] = "1"
import tensorflow as tf
import tensorflow_hub as hub

x = [
    ['我爱你'],
    ['我恨你'],
    ['爱你'],
    ['恨你'],
    ['爱'],
    ['恨'],
]
# Labels: 1 for the "love" (爱) sentences, 0 for the "hate" (恨) sentences
y = [
    1, 0, 1, 0, 1, 0
]

tx = tf.constant(x)
ty = tf.constant(tf.keras.utils.to_categorical(y, 2))

# Note: the URL ends with pool.tar.gz
model = tf.keras.Sequential([
  hub.KerasLayer('https://code.aliyun.com/qhduan/chinese_roberta_wwm_ext_L-12_H-768_A-12/raw/master/pool.tar.gz', trainable=False),
  tf.keras.layers.Dense(2, activation='softmax')
])

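model.compile(loss='categorical_crossentropy')
model.fit(tx, ty, epochs=10, batch_size=2)
# The softmax head outputs class probabilities rather than raw logits
probs = model.predict(tx)
pred = probs.argmax(-1).tolist()

# Compare the predictions with the gold labels
print(pred)
print(y)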
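
To make the trainable part of the classifier concrete, here is a NumPy sketch of what a `Dense(2, activation='softmax')` head computes on top of frozen 768-dimensional features, together with the categorical cross-entropy loss against one-hot labels. It is an illustration only: the helper names are hypothetical, and the features and weights are random stand-ins rather than values from the real model.

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dense_softmax_head(features, weights, bias):
    """Affine projection followed by softmax, as a Dense softmax layer does."""
    return softmax(features @ weights + bias)

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 768))             # stand-in for pooled BERT outputs
weights = rng.normal(scale=0.02, size=(768, 2))  # stand-in for the Dense kernel
bias = np.zeros(2)

labels = np.array([1, 0, 1, 0, 1, 0])
one_hot = np.eye(2)[labels]                      # same encoding as to_categorical(y, 2)

probs = dense_softmax_head(features, weights, bias)
loss = -np.mean(np.sum(one_hot * np.log(probs), axis=-1))
pred = probs.argmax(-1).tolist()
print(probs.shape, len(pred))
```

Since the BERT layer is created with `trainable=False`, only a kernel and bias of this shape (768x2 plus 2) would be updated during `fit`.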

BERT sequence model (SEQ)

A model that returns one embedding per token of the input sequence.

import tensorflow_hub as hub

# Note: the URL ends with seq.tar.gz
model = hub.KerasLayer('https://code.aliyun.com/qhduan/chinese_roberta_wwm_ext_L-12_H-768_A-12/raw/master/seq.tar.gz')

# y.shape == (1, 5, 768)
# [CLS], 我, 爱, 你, [SEP]: five symbols in total
y = model([['我爱你']])
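
The sequence length of 5 follows from splitting the Chinese text into single characters and wrapping it in `[CLS]` and `[SEP]`. The sketch below illustrates that kind of character-level splitting. It is a hypothetical stand-in written for this explanation, not the project's actual tokenizer, whose details may differ.

```python
import re

# Matches a bracketed special token such as [MASK],
# or otherwise any single non-whitespace character.
_TOKEN_PATTERN = re.compile(r'\[[A-Z]+\]|\S')

def char_tokenize(text: str) -> list:
    """Split text into single characters and add the [CLS]/[SEP] markers."""
    return ['[CLS]'] + _TOKEN_PATTERN.findall(text) + ['[SEP]']

tokens = char_tokenize('我爱你')
print(tokens, len(tokens))
```

Applied to an English word such as `hello`, the same function yields one token per letter, which is not how WordPiece would segment it. This is one way to see why a character-level approach is adequate for Chinese but not for English.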

BERT prediction model (PRED)

For example, using [MASK] to predict a missing character:

import tensorflow_hub as hub

# Note: the URL ends with pred.tar.gz
model = hub.KerasLayer('https://code.aliyun.com/qhduan/chinese_roberta_wwm_ext_L-12_H-768_A-12/raw/master/pred.tar.gz')

# y.shape == (1, 5, 21128)
y = model([['我[MASK]你']])

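# vocab.txt is the model's vocabulary file: one token per line, line index == token id
with open('vocab.txt', encoding='utf-8') as f:
    index2word = {k: v.strip() for k, v in enumerate(f)}

# Take the most likely token at each position, then drop [CLS] and [SEP]
# Expected result: 我 爱 你
r = [index2word[i] for i in y.numpy().argmax(-1).flatten()][1:-1]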
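
Taking the `argmax` keeps only the single best token at each position. When several candidates for the masked position are of interest, a top-k lookup is a natural extension. The sketch below assumes the model output holds unnormalized scores (if it already holds probabilities, the softmax step can be skipped). The function `top_k_tokens`, the toy vocabulary, and the toy scores are all hypothetical.

```python
import numpy as np

def top_k_tokens(scores: np.ndarray, index2word: dict, k: int = 3) -> list:
    """Return the k most likely (token, probability) pairs for one position."""
    shifted = scores - scores.max()            # stabilize the exponentials
    probs = np.exp(shifted) / np.exp(shifted).sum()
    top = np.argsort(probs)[::-1][:k]          # indices sorted by descending probability
    return [(index2word[int(i)], float(probs[i])) for i in top]

# Toy data (assumption): a 5-token vocabulary instead of the real 21128 entries
toy_vocab = {0: '[PAD]', 1: '爱', 2: '恨', 3: '想', 4: '你'}
toy_scores = np.array([-2.0, 3.0, 1.0, 2.0, 0.0])

candidates = top_k_tokens(toy_scores, toy_vocab, k=3)
print(candidates)
```

With the real model, the scores for the masked position in the example above would be `y.numpy()[0, 2]`, since `[CLS]` sits at index 0, 我 at index 1, and `[MASK]` at index 2.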

Model references

Repository:

https://github.com/qhduan/bert-model

RoBERTa and WWM (whole word masking) come from ymcui:

https://github.com/ymcui/Chinese-BERT-wwm
