Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmarks

May 8, 2023

Junyu Lu, Bo Xu, Xiaokun Zhang, Changrong Min, Liang Yang, Hongfei Lin

From arXiv; 13 pages, 4 figures. The paper has been accepted at ACL 2023.

The widespread dissemination of toxic online posts is increasingly damaging to society. However, research on detecting toxic language in Chinese has lagged significantly. Existing datasets lack fine-grained annotation of toxic types and expressions, and ignore samples with indirect toxicity. In addition, introducing lexical knowledge is crucial for detecting the toxicity of posts, yet doing so has been a challenge for researchers. In this paper, we facilitate the fine-grained detection of Chinese toxic language. First, we build Monitor Toxic Frame, a hierarchical taxonomy to analyze toxic types and expressions. Then, we present ToxiCN, a fine-grained dataset that includes both direct and indirect toxic samples. We also build an insult lexicon containing implicit profanity and propose Toxic Knowledge Enhancement (TKE) as a benchmark, incorporating the lexical feature to detect toxic language. In the experimental stage, we demonstrate the effectiveness of TKE. Finally, we give a systematic quantitative and qualitative analysis of the findings.
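
The abstract does not say how TKE incorporates the lexical feature, so the following is only an illustrative sketch of one simple way a lexicon could be turned into per-token features. It is not the authors' method. The function names (`lexicon_flags`, `toxic_term_ratio`) and the lexicon entries (`term_a`, `term_b`) are hypothetical placeholders, and the input is assumed to be already tokenized.

```python
from typing import List, Set


def lexicon_flags(tokens: List[str], lexicon: Set[str]) -> List[int]:
    """Return 1 for each token found in the lexicon, else 0."""
    return [1 if tok in lexicon else 0 for tok in tokens]


def toxic_term_ratio(tokens: List[str], lexicon: Set[str]) -> float:
    """Fraction of tokens that appear in the lexicon (0.0 for empty input)."""
    flags = lexicon_flags(tokens, lexicon)
    return sum(flags) / len(flags) if flags else 0.0


# Hypothetical placeholder lexicon; a real insult lexicon would be loaded from a file.
lexicon = {"term_a", "term_b"}
tokens = ["this", "post", "contains", "term_a"]

flags = lexicon_flags(tokens, lexicon)
ratio = toxic_term_ratio(tokens, lexicon)
print(flags, ratio)
```

Flags like these could, for example, be supplied to a classifier as an extra input alongside the token representations. Whether TKE does this, or something different, is not stated in the abstract.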
