The increase in abusive content on online social media platforms is impacting the social lives of online users. The use of offensive and hate speech has been making social media toxic. Homophobia and transphobia constitute offensive comments directed against the LGBT+ community. It is therefore imperative to detect and handle these comments, so that users indulging in such behaviour can be flagged or issued a warning in a timely manner. However, automated detection of such content is a challenging task, more so in Dravidian languages, which are identified as low-resource languages. Motivated by this, the paper explores the applicability of different deep learning models for classifying social media comments in Malayalam and Tamil as homophobic, transphobic, or non-anti-LGBT+ content. Popular deep learning models, namely a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) with GloVe embeddings, and transformer-based models (Multilingual BERT and IndicBERT), are applied to the classification problem. The results show that IndicBERT outperforms the other implemented models, with weighted average F1-scores of 0.86 and 0.77 for Malayalam and Tamil, respectively. The present work therefore confirms the superior performance of IndicBERT on the given task in the selected Dravidian languages.
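The evaluation metric reported above is the weighted average F1-score, i.e. the per-class F1 averaged with each class weighted by its support. As a minimal illustrative sketch (not the authors' evaluation code, and using hypothetical label names for the three classes), the metric can be computed as:

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Weighted-average F1: per-class F1, weighted by class support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for c in support:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        score += f1 * support[c] / total
    return score

# Toy 3-class example with hypothetical labels (not data from the paper)
y_true = ["non-anti-LGBT+", "non-anti-LGBT+", "homophobic", "transphobic"]
y_pred = ["non-anti-LGBT+", "homophobic", "homophobic", "transphobic"]
print(weighted_f1(y_true, y_pred))  # 0.75
```

The same result is obtained from `sklearn.metrics.f1_score` with `average="weighted"`; the hand-rolled version above is shown only to make the weighting explicit.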