In this paper we present our contribution to the TSAR-2022 Shared Task on Lexical Simplification of the EMNLP 2022 Workshop on Text Simplification, Accessibility, and Readability. Our approach builds on and extends the unsupervised lexical simplification system with pretrained encoders (LSBert) in the following ways: For the subtask of simplification candidate selection, it uses a RoBERTa transformer language model and expands the size of the generated candidate list. For the subsequent substitution ranking, it introduces a new feature weighting scheme and adopts a candidate filtering method based on textual entailment to maximize semantic similarity between the target word and its simplification. Our best-performing system improves over LSBert by 5.9% in accuracy and achieves second place out of 33 ranked solutions.
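To illustrate the candidate selection step described above, the following is a minimal sketch of generating substitution candidates with a RoBERTa masked language model. The abstract does not specify implementation details, so the model checkpoint, the Hugging Face fill-mask pipeline, the example sentence, and the candidate list size are assumptions made purely for illustration, not the authors' actual setup.

```python
# Illustrative sketch only: masking the target word and asking a RoBERTa
# masked language model for probable fillers; increasing top_k expands
# the generated candidate list. Model name and parameters are assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")

sentence = "The cat perched on the mat."
target = "perched"

# Replace the target word with RoBERTa's mask token and retrieve the
# model's most probable substitutions in context.
masked = sentence.replace(target, fill_mask.tokenizer.mask_token, 1)
candidates = fill_mask(masked, top_k=30)

for c in candidates:
    print(c["token_str"].strip(), round(c["score"], 4))
```

In a full system, such raw candidates would still need the subsequent ranking and filtering stages (e.g., feature weighting and entailment-based filtering) before a simplification is selected.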