December 18, 2022

AutoSlicer: Scalable Automated Data Slicing for ML Model Analysis


Zifan Liu, Evan Rosen, Paul Suganthan G. C.

From arXiv: 11 pages, 5 figures. NeurIPS 2022 Workshop on Challenges in Deploying and Monitoring Machine Learning Systems.

Automated slicing aims to identify subsets of evaluation data where a trained model performs anomalously. This is an important problem for machine learning pipelines in production since it plays a key role in model debugging and comparison, as well as the diagnosis of fairness issues. Scalability has become a critical requirement for any automated slicing system due to the large search space of possible slices and the growing scale of data. We present AutoSlicer, a scalable system that searches for problematic slices through distributed metric computation and hypothesis testing. We develop an efficient strategy that reduces the search space through pruning and prioritization. In the experiments, we show that our search strategy finds most of the anomalous slices by inspecting a small portion of the search space.
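
The abstract gives no implementation details, so the following is only a minimal single-machine sketch of the general idea (per-slice metric computation, a hypothesis test against the rest of the data, pruning of small slices, and ranking of the survivors). It is not the authors' system: the function name `find_anomalous_slices`, the `min_size` and `alpha` parameters, the normal-approximation test, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt
from statistics import NormalDist, mean, variance


def find_anomalous_slices(rows, losses, min_size=5, alpha=0.05):
    """Flag single-feature slices whose mean loss is significantly higher
    than the mean loss of all remaining examples (hypothetical helper)."""
    # Enumerate candidate slices: one per (feature, value) pair.
    slices = defaultdict(list)
    for i, row in enumerate(rows):
        for feature, value in row.items():
            slices[(feature, value)].append(i)

    flagged = []
    for (feature, value), idx in slices.items():
        members = set(idx)
        inside = [losses[i] for i in idx]
        outside = [loss for i, loss in enumerate(losses) if i not in members]
        # Pruning (assumed rule): skip slices too small to test reliably.
        if len(inside) < min_size or len(outside) < min_size:
            continue
        # Welch-style standard error of the difference in mean loss.
        se = sqrt(variance(inside) / len(inside) + variance(outside) / len(outside))
        if se == 0:
            continue
        z = (mean(inside) - mean(outside)) / se
        # One-sided p-value under a normal approximation (an assumption).
        p = 1.0 - NormalDist().cdf(z)
        if p < alpha:
            flagged.append((feature, value, z, p))

    # Prioritization (assumed rule): most significant slices first.
    flagged.sort(key=lambda item: item[3])
    return flagged


# Toy evaluation set: examples with country == "DE" have much higher loss.
rows = [
    {"country": "US" if i < 10 else "DE",
     "device": "mobile" if i % 2 == 0 else "desktop"}
    for i in range(20)
]
losses = [(0.1 if i % 2 == 0 else 0.2) + (0.0 if i < 10 else 0.7) for i in range(20)]

result = find_anomalous_slices(rows, losses)
for feature, value, z, p in result:
    print(f"{feature}={value}: z={z:.2f}, p={p:.3g}")
```

On this toy data only the `country=DE` slice is flagged. A scalable system would distribute the per-slice aggregation and would also have to handle multi-feature slices and multiple-testing correction, none of which this sketch attempts.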

