对现实世界中不受欢迎的内容探测的综合办法 (A Holistic Approach to Undesired Content Detection in the Real World) - 专知论文

会员服务 ·

0

Taxonomy · 稳健性 · MoDELS · 值域 · 主动学习 ·

2023 年 2 月 14 日

A Holistic Approach to Undesired Content Detection in the Real World

翻译：对现实世界中不受欢迎的内容探测的综合办法

Todor Markov,Chong Zhang,Sandhini Agarwal,Tyna Eloundou,Teddy Lee,Steven Adler,Angela Jiang,Lilian Weng

from arxiv, Oral presentation at AAAI-23

We present a holistic approach to building a robust and useful natural language classification system for real-world content moderation. The success of such a system relies on a chain of carefully designed and executed steps, including the design of content taxonomies and labeling instructions, data quality control, an active learning pipeline to capture rare events, and a variety of methods to make the model robust and to avoid overfitting. Our moderation system is trained to detect a broad set of categories of undesired content, including sexual content, hateful content, violence, self-harm, and harassment. This approach generalizes to a wide range of different content taxonomies and can be used to create high-quality content classifiers that outperform off-the-shelf models.

翻译：我们提出了一个全面的方法来建立一个强大和有用的自然语言分类系统,以适应现实世界内容的节制。这样一个系统的成功取决于一系列精心设计和实施的步骤,包括内容分类和标签指示的设计、数据质量控制、记录稀有事件的积极学习管道以及使模型稳健和避免过度适应的各种方法。我们的节制系统受过培训,能够发现一系列广泛的不受欢迎的内容,包括性内容、仇恨内容、暴力、自我伤害和骚扰。这个方法概括了各种各样的内容分类,可以用来创建超越现成模式的高质量内容分类器。

0

相关内容

Taxonomy

分类学是分类的实践和科学。Wikipedia类别说明了一种分类法，可以通过自动方式提取Wikipedia类别的完整分类法。截至2009年，已经证明，可以使用人工构建的分类法（例如像WordNet这样的计算词典的分类法）来改进和重组Wikipedia类别分类法。从广义上讲，分类法还适用于除父子层次结构以外的关系方案，例如网络结构。然后分类法可能包括有多父母的单身孩子，例如，“汽车”可能与父母双方一起出现“车辆”和“钢结构”；但是对某些人而言，这仅意味着“汽车”是几种不同分类法的一部分。分类法也可能只是将事物组织成组，或者是按字母顺序排列的列表；但是在这里，术语词汇更合适。在知识管理中的当前用法中，分类法被认为比本体论窄，因为本体论应用了各种各样的关系类型。在数学上，分层分类法是给定对象集的分类树结构。该结构的顶部是适用于所有对象的单个分类，即根节点。此根下的节点是更具体的分类，适用于总分类对象集的子集。推理的进展从一般到更具体。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

孤独症儿童早期干预的同步TMS-EEG研究

国家自然科学基金

0+阅读 · 2017年12月31日

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

基于表达水平、剪切机制、序列和结构的动物非编码RNA保守性与进化的系统分析

国家自然科学基金

0+阅读 · 2015年12月31日

大白菜KIN基因的表达及其pre-mRNA加工机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

MMP9启动子去甲基化在早期糖尿病肾病足细胞表型转化中的调控作用

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

Tip60在oxLDL诱导的血管平滑肌细胞自噬及增殖中的作用机制

国家自然科学基金

0+阅读 · 2013年12月31日

靶向运载siRNA调控肾癌VEGF Pre-mRNA选择性剪接的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Tecto调节非洲爪蛙胚层决定与分化的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

双极性树枝状蓝光PhOLED用Ir（Ⅲ）金属配合物的合成与性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy

Arxiv

0+阅读 · 2023年4月5日

Side Channel-Assisted Inference Leakage from Machine Learning-based ECG Classification

Arxiv

0+阅读 · 2023年4月4日

Adversarial Detection: Attacking Object Detection in Real Time

Arxiv

0+阅读 · 2023年4月4日

The Vector Grounding Problem

Arxiv

0+阅读 · 2023年4月4日

Deep Class-Incremental Learning: A Survey

Arxiv

13+阅读 · 2023年2月7日

Generalized Out-of-Distribution Detection: A Survey

Generalized Out-of-Distribution Detection: A Survey

Arxiv

15+阅读 · 2021年10月21日

A Survey of Human-in-the-loop for Machine Learning

Arxiv

35+阅读 · 2021年8月2日

Towards Open World Object Detection

Arxiv

13+阅读 · 2021年3月3日

Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

Arxiv

10+阅读 · 2021年1月24日

Dynamic Zoom-in Network for Fast Object Detection in Large Images

Arxiv

20+阅读 · 2018年3月27日

VIP会员

文章信息

相关主题

相关VIP内容

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《利用射频传感器载荷增强无人机的侦察、监视与目标获取（ISR）能力》报告

《导航战》2025最新报告

人工智能驱动的国防战术通信与网络：提升现代战争中的态势感知、安全性与自主决策 | 万字长文

《有人-无人轻型驱逐舰与中型无人水面艇支队在第二与第一岛链作战中的部署概念（CONOPS）》56页报告

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy

Arxiv

0+阅读 · 2023年4月5日

Side Channel-Assisted Inference Leakage from Machine Learning-based ECG Classification

Arxiv

0+阅读 · 2023年4月4日

Adversarial Detection: Attacking Object Detection in Real Time

Arxiv

0+阅读 · 2023年4月4日

The Vector Grounding Problem

Arxiv

0+阅读 · 2023年4月4日

Deep Class-Incremental Learning: A Survey

Arxiv

13+阅读 · 2023年2月7日

Generalized Out-of-Distribution Detection: A Survey

Generalized Out-of-Distribution Detection: A Survey

Arxiv

15+阅读 · 2021年10月21日

A Survey of Human-in-the-loop for Machine Learning

Arxiv

35+阅读 · 2021年8月2日

Towards Open World Object Detection

Arxiv

13+阅读 · 2021年3月3日

Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

Arxiv

10+阅读 · 2021年1月24日

Dynamic Zoom-in Network for Fast Object Detection in Large Images

Arxiv

20+阅读 · 2018年3月27日

相关基金

孤独症儿童早期干预的同步TMS-EEG研究

国家自然科学基金

0+阅读 · 2017年12月31日

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

基于表达水平、剪切机制、序列和结构的动物非编码RNA保守性与进化的系统分析

国家自然科学基金

0+阅读 · 2015年12月31日

大白菜KIN基因的表达及其pre-mRNA加工机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

MMP9启动子去甲基化在早期糖尿病肾病足细胞表型转化中的调控作用

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

Tip60在oxLDL诱导的血管平滑肌细胞自噬及增殖中的作用机制

国家自然科学基金

0+阅读 · 2013年12月31日

靶向运载siRNA调控肾癌VEGF Pre-mRNA选择性剪接的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Tecto调节非洲爪蛙胚层决定与分化的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

双极性树枝状蓝光PhOLED用Ir（Ⅲ）金属配合物的合成与性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

微信扫码咨询专知VIP会员