While transfer learning has become a ubiquitous technique across Natural Language Processing (NLP) tasks, it often fails to replicate the performance of pre-trained models on text from niche domains such as automotive. In this paper we aim to understand the main characteristics of the distribution shift in automotive-domain text (describing technical functionalities such as Cruise Control) and attempt to explain the potential reasons for the gap in performance. We focus on the Named Entity Recognition (NER) task, as it requires strong lexical, syntactic, and semantic understanding from the model. Our experiments with two different encoders, namely BERT-Base-Uncased and SciBERT-Base-Scivocab-Uncased, have led to interesting findings: 1) SciBERT outperforms BERT on automotive-domain text, 2) Fine-tuning the language models on automotive-domain text did not yield significant improvements in NER performance, 3) The distribution shift is challenging because it is characterized by a lack of repeating contexts, sparse entities, a large number of Out-Of-Vocabulary (OOV) words, and class overlap arising from domain-specific nuances.
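As context for the two encoders compared in the abstract, the sketch below shows one way such a comparison could be set up with the HuggingFace Transformers library; the label set and the example sentence are hypothetical, since the abstract does not specify the automotive entity types or the training pipeline used.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical entity label set; the actual automotive tag scheme is not given in the abstract.
labels = ["O", "B-FUNCTION", "I-FUNCTION", "B-SIGNAL", "I-SIGNAL"]

# The two encoders named in the abstract, as published on the HuggingFace Hub.
for checkpoint in ["bert-base-uncased", "allenai/scibert_scivocab_uncased"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForTokenClassification.from_pretrained(
        checkpoint, num_labels=len(labels)
    )

    # Run a forward pass on a sample automotive-domain sentence (untrained head,
    # illustrative only): logits have shape (batch, sequence_length, num_labels).
    inputs = tokenizer(
        "Cruise Control maintains the set vehicle speed.",
        return_tensors="pt",
    )
    logits = model(**inputs).logits
    print(checkpoint, logits.shape)
```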