Protagonists' Tagger in Literary Domain -- New Datasets and a Method for Person Entity Linkage - 专知论文

October 4, 2021

Weronika Łajewska, Anna Wróblewska

Semantic annotation of long texts, such as novels, remains an open challenge in Natural Language Processing (NLP). This research investigates the problem of detecting person entities and assigning them unique identities, i.e., recognizing people (especially main characters) in novels. We prepared a method for person entity linkage (named entity recognition and disambiguation) and new testing datasets. The datasets comprise 1,300 sentences from 13 classic novels of different genres that a novel reader had manually annotated. Our process of identifying literary characters in a text, implemented in protagonistTagger, comprises two stages: (1) named entity recognition (NER) of persons, (2) named entity disambiguation (NED) - matching each recognized person with the literary character's full name, based on approximate text matching. The protagonistTagger achieves both precision and recall of above 83% on the prepared testing sets. Finally, we gathered a corpus of 13 full-text novels tagged with protagonistTagger that comprises more than 35,000 mentions of literary characters.
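The second stage described above, matching each recognized person mention to a character's full name by approximate text matching, can be illustrated with a minimal sketch. This is not the protagonistTagger implementation: the function names `similarity` and `link_mention`, the 0.6 threshold, and the use of Python's standard `difflib` ratio are all assumptions made for illustration, and the NER stage is assumed to have already produced the mention strings.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] based on difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_mention(mention, full_names, threshold=0.6):
    """Return the full name that best matches `mention`, or None.

    Hypothetical helper: the mention is compared against each full name
    and against each of its tokens, so that "Darcy" can match
    "Fitzwilliam Darcy". The threshold is an illustrative assumption.
    """
    best_name, best_score = None, 0.0
    for name in full_names:
        candidates = [name] + name.split()
        score = max(similarity(mention, c) for c in candidates)
        # Strict ">" keeps the first name on ties (e.g. a shared surname).
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Example character list (illustrative, not taken from the paper's datasets).
characters = ["Elizabeth Bennet", "Fitzwilliam Darcy", "Jane Bennet"]
for mention in ["Darcy", "Elizabeth", "Darcey", "Xyz"]:
    print(mention, "->", link_mention(mention, characters))
```

A surname-only mention such as "Bennet" is ambiguous between two characters in this list; the sketch simply returns the first best match, whereas a real system would need additional context to disambiguate.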
