Producing sentence embeddings in an unsupervised manner is valuable for natural language matching and retrieval problems in practice. In this work, we conduct a thorough examination of unsupervised sentence embeddings derived from pretrained models. We study four pretrained models and conduct large-scale experiments on seven datasets regarding sentence semantics. We report three main findings. First, averaging all token representations is better than using only the [CLS] vector. Second, combining both the top and bottom layers is better than using only the top layers. Lastly, a simple whitening-based vector normalization strategy of fewer than 10 lines of code consistently boosts performance.
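As a rough illustration of the third finding, the following numpy sketch implements one common form of whitening: shift the embeddings to zero mean, then apply a linear transform derived from the SVD of their covariance so that the result has (approximately) identity covariance. The function name `whiten` and the `eps` guard against near-zero singular values are our own additions; the paper's exact implementation may differ.

```python
import numpy as np

def whiten(embeddings, eps=1e-9):
    """Whiten sentence embeddings to zero mean and identity covariance.

    `embeddings` is an (n_sentences, dim) array, e.g. sentence vectors
    obtained by averaging all token representations from the top and
    bottom layers of a pretrained model.
    """
    mu = embeddings.mean(axis=0, keepdims=True)   # per-dimension mean
    cov = np.cov((embeddings - mu).T)             # (dim, dim) covariance matrix
    u, s, _ = np.linalg.svd(cov)                  # SVD of the symmetric covariance
    W = u @ np.diag(1.0 / np.sqrt(s + eps))       # whitening transformation
    return (embeddings - mu) @ W                  # whitened embeddings
```

Because the covariance matrix is symmetric positive semi-definite, its SVD coincides with its eigendecomposition, and scaling each principal direction by the inverse square root of its singular value is what equalizes the variance across dimensions.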