NLP-LTU at SemEval-2023 Task 10: The Impact of Data Augmentation and Semi-Supervised Learning Techniques on Text Classification Performance on an Imbalanced Dataset


April 25, 2023


Sana Sabah Al-Azzawi, György Kovács, Filip Nilsson, Tosin Adewumi, Marcus Liwicki

From arXiv; 6 pages, 5 figures. This paper has been accepted at the SemEval workshop at the ACL 2023 conference.

In this paper, we propose a methodology for Task 10 of SemEval-2023, which focuses on detecting and classifying online sexism in social media posts. The task tackles a serious issue, as detecting harmful content on social media platforms is crucial for mitigating the harm these posts cause to users. Our solution is based on an ensemble of fine-tuned transformer-based models (BERTweet, RoBERTa, and DeBERTa). To alleviate problems related to class imbalance and to improve the generalization capability of our model, we also experiment with data augmentation and semi-supervised learning. In particular, for data augmentation, we use back-translation, either on all classes or on the underrepresented classes only, and we analyze the impact of these strategies on the overall performance of the pipeline through extensive experiments. For semi-supervised learning, we found that with a substantial amount of unlabelled, in-domain data available, semi-supervised learning can enhance the performance of certain models. Our proposed method (for which the source code is available on GitHub) attains an F1-score of 0.8613 for sub-task A, which ranked us 10th in the competition.
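The two augmentation settings mentioned in the abstract (back-translation on all classes, or on the underrepresented classes only) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `augment`, the `minority_only` flag, the toy labels, and the stand-in `toy_back_translate` are all hypothetical, and "underrepresented" is assumed here to mean any class other than the most frequent one. A real pipeline would replace the stand-in with an actual round trip through a machine translation model.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, label)


def augment(
    data: List[Example],
    back_translate: Callable[[str], str],
    minority_only: bool = True,
) -> List[Example]:
    """Return the original examples plus back-translated copies.

    With minority_only=True, only examples whose label is not the most
    frequent one are paraphrased (an assumed definition of
    "underrepresented"); otherwise every example is paraphrased.
    """
    if not data:
        return []
    counts = Counter(label for _, label in data)
    majority_label = counts.most_common(1)[0][0]
    augmented = list(data)  # keep all original examples
    for text, label in data:
        if minority_only and label == majority_label:
            continue  # skip the majority class in minority-only mode
        augmented.append((back_translate(text), label))
    return augmented


def toy_back_translate(text: str) -> str:
    # Hypothetical stand-in for an English -> pivot -> English round trip.
    return text + " (paraphrased)"


# Toy imbalanced dataset; the label names are illustrative assumptions.
train = [
    ("post a", "not sexist"),
    ("post b", "not sexist"),
    ("post c", "not sexist"),
    ("post d", "sexist"),
]
minority_aug = augment(train, toy_back_translate, minority_only=True)
all_aug = augment(train, toy_back_translate, minority_only=False)
```

In the minority-only setting the class ratio moves toward balance, while augmenting all classes leaves the ratio unchanged and only increases the amount of training data, which is one reason the two settings may affect performance differently.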
