Mixup · Word Sense Disambiguation · Performer · Reducible · Extensibility

December 21, 2022

SMSMix: Sense-Maintained Sentence Mixup for Word Sense Disambiguation


Hee Suk Yoon,Eunseop Yoon,John Harvill,Sunjae Yoon,Mark Hasegawa-Johnson,Chang D. Yoo

From arXiv; EMNLP 2022

Word Sense Disambiguation (WSD) is an NLP task aimed at determining the correct sense of a word in a sentence from discrete sense choices. Although current systems have attained unprecedented performances for such tasks, the nonuniform distribution of word senses during training generally results in systems performing poorly on rare senses. To this end, we consider data augmentation to increase the frequency of these least frequent senses (LFS) to reduce the distributional bias of senses during training. We propose Sense-Maintained Sentence Mixup (SMSMix), a novel word-level mixup method that maintains the sense of a target word. SMSMix smoothly blends two sentences using mask prediction while preserving the relevant span determined by saliency scores to maintain a specific word's sense. To the best of our knowledge, this is the first attempt to apply mixup in NLP while preserving the meaning of a specific word. With extensive experiments, we validate that our augmentation method can effectively give more information about rare senses during training with maintained target sense label.
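The span-preservation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the saliency scores are supplied by hand, and `select_span`, `mask_outside_span`, the window width, and the `[MASK]` placeholder are all hypothetical names and choices made for illustration. The sketch only shows how a span around the target word might be kept while the rest of the sentence is masked; filling the masked positions with a mask-prediction model using context from a second sentence is left out.

```python
from typing import List, Tuple

MASK = "[MASK]"  # assumed placeholder; a masked language model would fill these in


def select_span(saliency: List[float], target: int, width: int) -> Tuple[int, int]:
    """Return (start, end) of the window of `width` tokens that contains
    `target` and has the largest total saliency. `end` is exclusive."""
    n = len(saliency)
    width = min(width, n)
    best_start, best_score = 0, float("-inf")
    # Only windows that include the target index are candidates.
    for start in range(max(0, target - width + 1), min(target, n - width) + 1):
        score = sum(saliency[start:start + width])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_start + width


def mask_outside_span(tokens: List[str], span: Tuple[int, int]) -> List[str]:
    """Keep tokens inside the span; replace every other token with MASK."""
    start, end = span
    return [tok if start <= i < end else MASK for i, tok in enumerate(tokens)]


# Toy example: the target word is "bank" (index 5); saliency values are made up.
tokens = ["he", "sat", "on", "the", "river", "bank", "yesterday"]
saliency = [0.1, 0.2, 0.1, 0.3, 0.9, 0.8, 0.1]
span = select_span(saliency, target=5, width=3)
masked = mask_outside_span(tokens, span)
print(span, masked)
```

In this toy case the kept window is "the river bank", so the context that signals the intended sense of "bank" survives and the original sense label can be carried over to the augmented sentence.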

