发现并校验 AI 错误, 使用众源失败报告 (Discovering and Validating AI Errors With Crowdsourced Failure Reports) - 专知论文

会员服务 ·

0

可辨认的 · CASES · MoDELS · 模型性能 · AI ·

2021 年 9 月 23 日

Discovering and Validating AI Errors With Crowdsourced Failure Reports

翻译：发现并校验 AI 错误, 使用众源失败报告

Ángel Alexander Cabrera,Abraham J. Druck,Jason I. Hong,Adam Perer

AI systems can fail to learn important behaviors, leading to real-world issues like safety concerns and biases. Discovering these systematic failures often requires significant developer attention, from hypothesizing potential edge cases to collecting evidence and validating patterns. To scale and streamline this process, we introduce crowdsourced failure reports, end-user descriptions of how or why a model failed, and show how developers can use them to detect AI errors. We also design and implement Deblinder, a visual analytics system for synthesizing failure reports that developers can use to discover and validate systematic failures. In semi-structured interviews and think-aloud studies with 10 AI practitioners, we explore the affordances of the Deblinder system and the applicability of failure reports in real-world settings. Lastly, we show how collecting additional data from the groups identified by developers can improve model performance.

翻译：AI系统可能无法学习重要的行为,导致安全关切和偏向等真实世界问题。发现这些系统性的失败往往需要开发者大力关注,从假设潜在边缘案例到收集证据和验证模式。为了扩大和简化这一过程,我们引入了多方源故障报告、终端用户描述模型如何失败或为什么失败,并展示开发者如何利用模型发现AI错误。我们还设计和实施了Debliner,这是一个视觉分析系统,用于合成开发者可用来发现和验证系统性失败的报告。在与10名AI实践者进行的半结构化访谈和智囊研究中,我们探索了Debliner系统的价格和故障报告在现实世界环境中的可适用性。最后,我们展示了如何从开发者确定的团体收集更多数据来改进模型性能。

0

相关内容

可辨认的

SIGIR2021接受论文列表公布！151篇论文都在这了！

专知会员服务

38+阅读 · 2021年4月27日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

专知会员服务

52+阅读 · 2020年6月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

已删除

将门创投

7+阅读 · 2019年10月15日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议截稿信息9条

人工智能 | 国际会议截稿信息9条

Call4Papers

4+阅读 · 2018年3月13日

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

专知

6+阅读 · 2017年10月14日

Words of Wisdom: Representational Harms in Learning From AI Communication

Words of Wisdom: Representational Harms in Learning From AI Communication

Arxiv

0+阅读 · 2021年11月16日

A repeated-measures study on emotional responses after a year in the pandemic

Arxiv

0+阅读 · 2021年11月16日

Developing a Prototype of a Mechanical Ventilator Controller from Requirements to Code with ASMETA

Arxiv

0+阅读 · 2021年11月16日

Estimating Individual Treatment Effects using Non-Parametric Regression Models: a Review

Arxiv

0+阅读 · 2021年11月15日

Understanding the Information Needs and Practices of Human Supporters of an Online Mental Health Intervention to Inform Machine Learning Applications

Arxiv

0+阅读 · 2021年11月12日

GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering

Arxiv

3+阅读 · 2019年5月10日

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Arxiv

3+阅读 · 2019年5月5日

Creativity Inspired Zero-Shot Learning

Arxiv

4+阅读 · 2019年4月3日

Discovery and recognition of motion primitives in human activities

Discovery and recognition of motion primitives in human activities

Arxiv

4+阅读 · 2019年2月4日

Contrastive Explanations for Reinforcement Learning in terms of Expected Consequences

Contrastive Explanations for Reinforcement Learning in terms of Expected Consequences

Arxiv

5+阅读 · 2018年7月23日

VIP会员

文章信息

相关主题

相关VIP内容

SIGIR2021接受论文列表公布！151篇论文都在这了！

专知会员服务

38+阅读 · 2021年4月27日

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

专知会员服务

52+阅读 · 2020年6月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

已删除

将门创投

7+阅读 · 2019年10月15日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

人工智能 | 国际会议截稿信息9条

人工智能 | 国际会议截稿信息9条

Call4Papers

4+阅读 · 2018年3月13日

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

专知

6+阅读 · 2017年10月14日

相关论文

Words of Wisdom: Representational Harms in Learning From AI Communication

Words of Wisdom: Representational Harms in Learning From AI Communication

Arxiv

0+阅读 · 2021年11月16日

A repeated-measures study on emotional responses after a year in the pandemic

Arxiv

0+阅读 · 2021年11月16日

Developing a Prototype of a Mechanical Ventilator Controller from Requirements to Code with ASMETA

Arxiv

0+阅读 · 2021年11月16日

Estimating Individual Treatment Effects using Non-Parametric Regression Models: a Review

Arxiv

0+阅读 · 2021年11月15日

Understanding the Information Needs and Practices of Human Supporters of an Online Mental Health Intervention to Inform Machine Learning Applications

Arxiv

0+阅读 · 2021年11月12日

GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering

Arxiv

3+阅读 · 2019年5月10日

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Arxiv

3+阅读 · 2019年5月5日

Creativity Inspired Zero-Shot Learning

Arxiv

4+阅读 · 2019年4月3日

Discovery and recognition of motion primitives in human activities

Discovery and recognition of motion primitives in human activities

Arxiv

4+阅读 · 2019年2月4日

Contrastive Explanations for Reinforcement Learning in terms of Expected Consequences

Contrastive Explanations for Reinforcement Learning in terms of Expected Consequences

Arxiv

5+阅读 · 2018年7月23日

微信扫码咨询专知VIP会员