图尔库插注指南 (Annotation Guidelines for the Turku Paraphrase Corpus) - 专知论文

会员服务 ·

0

基 · 标注 · 缩放 · 自然语言处理 ·

2021 年 8 月 19 日

Annotation Guidelines for the Turku Paraphrase Corpus

翻译：图尔库插注指南

Jenna Kanerva,Filip Ginter,Li-Hsin Chang,Iiro Rastas,Valtteri Skantsi,Jemina Kilpeläinen,Hanna-Mari Kupari,Aurora Piirto,Jenna Saarni,Maija Sevón,Otto Tarkka

from arxiv, The Turku Paraphrase Corpus is available at https://turkunlp.org/paraphrase.html

This document describes the annotation guidelines used to construct the Turku Paraphrase Corpus. These guidelines were developed together with the corpus annotation, revising and extending the guidelines regularly during the annotation work. Our paraphrase annotation scheme uses the base scale 1-4, where labels 1 and 2 are used for negative candidates (not paraphrases), while labels 3 and 4 are paraphrases at least in the given context if not everywhere. In addition to base labeling, the scheme is enriched with additional subcategories (flags) for categorizing different types of paraphrases inside the two positive labels, making the annotation scheme suitable for more fine-grained paraphrase categorization. The annotation scheme is used to annotate over 100,000 Finnish paraphrase pairs.

翻译：本文件介绍了用于构建图尔库·帕拉斯潘·科普斯的注解指南。这些指南是在注解工作期间与文体注解、定期修订和扩展指南一起制定的。我们的注解方案使用1-4基准等级表,其中标签1和2用于负数候选人(而不是副词句),而标签3和4在特定情况下至少是引言,如果不是在所有地方的话。除了基本标签之外,还添加了额外的子类(旗号),用于对两种正面标签内的不同类型的注解进行分类,使注解方案适合于更精细的注解参数分类。注计划用于注解超过10万对芬兰方的注解。

0

相关内容

【IJCAJ 2020】多通道神经网络 Multi-Channel Graph Neural Networks

【IJCAJ 2020】多通道神经网络 Multi-Channel Graph Neural Networks

专知会员服务

26+阅读 · 2020年7月19日

【机器学习工具箱(机器学习实用库分类大列表)】《Machine Learning Toolbox》by Amit Chaudhary

【机器学习工具箱(机器学习实用库分类大列表)】《Machine Learning Toolbox》by Amit Chaudhary

专知会员服务

30+阅读 · 2020年7月12日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【Facebook AI-ICLR2020】神经网络训练早期阶段探究，Early Phase of NN Training

【Facebook AI-ICLR2020】神经网络训练早期阶段探究，Early Phase of NN Training

专知会员服务

18+阅读 · 2020年3月3日

【华南理工大学】无监督多类域自适应:理论、算法和实践，Unsupervised Multi-Class DA

专知会员服务

28+阅读 · 2020年3月2日

【论文推荐】基于BERT修剪的问答模型（Pruning a BERT-based Question Answering Model）

【论文推荐】基于BERT修剪的问答模型（Pruning a BERT-based Question Answering Model）

专知会员服务

30+阅读 · 2019年11月22日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

深度学习自然语言处理

7+阅读 · 2020年4月8日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

一文读懂命名实体识别

一文读懂命名实体识别

人工智能头条

33+阅读 · 2019年3月29日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

半监督多任务学习：Semisupervised Multitask Learning

半监督多任务学习：Semisupervised Multitask Learning

我爱读PAMI

18+阅读 · 2018年4月29日

上百份文字的检测与识别资源，包含数据集、code和paper

上百份文字的检测与识别资源，包含数据集、code和paper

数据挖掘入门与实战

17+阅读 · 2017年12月7日

计算机类 | 国际会议信息7条

计算机类 | 国际会议信息7条

Call4Papers

3+阅读 · 2017年11月17日

Large Scale Substitution-based Word Sense Induction

Arxiv

0+阅读 · 2021年10月14日

Masader: Metadata Sourcing for Arabic Text and Speech Data Resources

Arxiv

0+阅读 · 2021年10月13日

Quantifying With Only Positive Training Data

Arxiv

0+阅读 · 2021年10月12日

Toward Annotator Group Bias in Crowdsourcing

Arxiv

0+阅读 · 2021年10月8日

An end-to-end Neural Network Framework for Text Clustering

An end-to-end Neural Network Framework for Text Clustering

Arxiv

6+阅读 · 2019年3月22日

QuAC : Question Answering in Context

QuAC : Question Answering in Context

Arxiv

4+阅读 · 2018年8月21日

Comparative Analysis of Neural QA models on SQuAD

Arxiv

6+阅读 · 2018年6月18日

Sentiment Analysis of Arabic Tweets: Feature Engineering and A Hybrid Approach

Arxiv

7+阅读 · 2018年5月22日

Phrase-Based & Neural Unsupervised Machine Translation

Arxiv

4+阅读 · 2018年4月20日

SentiPers: A Sentiment Analysis Corpus for Persian

Arxiv

5+阅读 · 2018年1月23日

VIP会员

文章信息

相关主题

自然语言处理

相关VIP内容

【IJCAJ 2020】多通道神经网络 Multi-Channel Graph Neural Networks

【IJCAJ 2020】多通道神经网络 Multi-Channel Graph Neural Networks

专知会员服务

26+阅读 · 2020年7月19日

【机器学习工具箱(机器学习实用库分类大列表)】《Machine Learning Toolbox》by Amit Chaudhary

【机器学习工具箱(机器学习实用库分类大列表)】《Machine Learning Toolbox》by Amit Chaudhary

专知会员服务

30+阅读 · 2020年7月12日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【Facebook AI-ICLR2020】神经网络训练早期阶段探究，Early Phase of NN Training

【Facebook AI-ICLR2020】神经网络训练早期阶段探究，Early Phase of NN Training

专知会员服务

18+阅读 · 2020年3月3日

【华南理工大学】无监督多类域自适应:理论、算法和实践，Unsupervised Multi-Class DA

专知会员服务

28+阅读 · 2020年3月2日

【论文推荐】基于BERT修剪的问答模型（Pruning a BERT-based Question Answering Model）

【论文推荐】基于BERT修剪的问答模型（Pruning a BERT-based Question Answering Model）

专知会员服务

30+阅读 · 2019年11月22日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】面向可扩展深度神经网络的预测编码：理论与实践

如何快速获取数百万架无人机？

EMNLP 2025 | RTQA：递归思想求解复杂的时间知识图谱问答

组合式零样本学习综述

相关资讯

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

【论文笔记】通俗理解少样本文本分类 (Few-Shot Text Classification) (1)

深度学习自然语言处理

7+阅读 · 2020年4月8日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

一文读懂命名实体识别

一文读懂命名实体识别

人工智能头条

33+阅读 · 2019年3月29日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

半监督多任务学习：Semisupervised Multitask Learning

半监督多任务学习：Semisupervised Multitask Learning

我爱读PAMI

18+阅读 · 2018年4月29日

上百份文字的检测与识别资源，包含数据集、code和paper

上百份文字的检测与识别资源，包含数据集、code和paper

数据挖掘入门与实战

17+阅读 · 2017年12月7日

计算机类 | 国际会议信息7条

计算机类 | 国际会议信息7条

Call4Papers

3+阅读 · 2017年11月17日

相关论文

Large Scale Substitution-based Word Sense Induction

Arxiv

0+阅读 · 2021年10月14日

Masader: Metadata Sourcing for Arabic Text and Speech Data Resources

Arxiv

0+阅读 · 2021年10月13日

Quantifying With Only Positive Training Data

Arxiv

0+阅读 · 2021年10月12日

Toward Annotator Group Bias in Crowdsourcing

Arxiv

0+阅读 · 2021年10月8日

An end-to-end Neural Network Framework for Text Clustering

An end-to-end Neural Network Framework for Text Clustering

Arxiv

6+阅读 · 2019年3月22日

QuAC : Question Answering in Context

QuAC : Question Answering in Context

Arxiv

4+阅读 · 2018年8月21日

Comparative Analysis of Neural QA models on SQuAD

Arxiv

6+阅读 · 2018年6月18日

Sentiment Analysis of Arabic Tweets: Feature Engineering and A Hybrid Approach

Arxiv

7+阅读 · 2018年5月22日

Phrase-Based & Neural Unsupervised Machine Translation

Arxiv

4+阅读 · 2018年4月20日

SentiPers: A Sentiment Analysis Corpus for Persian

Arxiv

5+阅读 · 2018年1月23日

微信扫码咨询专知VIP会员