Automated slicing aims to identify subsets of evaluation data on which a trained model performs anomalously. This is an important problem for production machine learning pipelines because it plays a key role in model debugging and comparison, as well as in diagnosing fairness issues. Scalability has become a critical requirement for any automated slicing system because of the large search space of possible slices and the growing scale of data. We present Autoslicer, a scalable system that searches for problematic slices through distributed metric computation and hypothesis testing. We develop an efficient strategy that reduces the search space through pruning and prioritization. In our experiments, this search strategy finds most of the anomalous slices while inspecting only a small portion of the search space.
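To make the idea of slice search by metric computation and hypothesis testing concrete, the following is a minimal single-machine sketch. It is not Autoslicer's algorithm: the abstract does not specify the test statistic, the pruning rule, or the prioritization order, so everything here is an assumption chosen for illustration. The sketch assumes that slices are conjunctions of categorical feature values, that the metric is a per-example loss, that significance is judged with a Welch-style statistic against the rest of the data, that pruning is a minimum slice size, and that prioritization is a sort by the statistic. The names `welch_statistic`, `find_slices`, `min_size`, and `threshold` are hypothetical.

```python
import math
from itertools import combinations


def welch_statistic(inside, outside):
    """Welch-style statistic for mean(inside) - mean(outside).

    Assumed test for this sketch; the source does not name one.
    Both samples must contain at least two values.
    """
    n1, n2 = len(inside), len(outside)
    m1, m2 = sum(inside) / n1, sum(outside) / n2
    v1 = sum((x - m1) ** 2 for x in inside) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in outside) / (n2 - 1)
    se = math.sqrt(v1 / n1 + v2 / n2)
    if se == 0.0:
        # No variance in either sample: any difference is unbounded.
        return 0.0 if m1 == m2 else math.copysign(math.inf, m1 - m2)
    return (m1 - m2) / se


def find_slices(rows, losses, features, min_size=5, threshold=2.0, max_order=2):
    """Return (slice, size, statistic) triples, worst slices first."""
    min_size = max(min_size, 2)  # the variance estimate needs two points
    found = []
    for order in range(1, max_order + 1):
        for feats in combinations(features, order):
            # Group row indices by their values on the chosen features.
            groups = {}
            for i, row in enumerate(rows):
                groups.setdefault(tuple(row[f] for f in feats), []).append(i)
            for key, idx in groups.items():
                # Pruning (assumed rule): skip slices that are too small
                # to test, or that leave too little data to compare with.
                if len(idx) < min_size or len(rows) - len(idx) < 2:
                    continue
                members = set(idx)
                inside = [losses[i] for i in idx]
                outside = [l for i, l in enumerate(losses) if i not in members]
                stat = welch_statistic(inside, outside)
                if stat > threshold:
                    found.append((dict(zip(feats, key)), len(idx), stat))
    # Prioritization (assumed rule): most significant slices first.
    found.sort(key=lambda item: item[2], reverse=True)
    return found


# Synthetic data: 10 rows per (country, device) pair, with 0/1 losses.
# Errors are concentrated in the ("DE", "mobile") slice.
rows, losses = [], []
for country in ("US", "DE"):
    for device in ("desktop", "mobile"):
        errors = 8 if (country, device) == ("DE", "mobile") else 1
        for k in range(10):
            rows.append({"country": country, "device": device})
            losses.append(1.0 if k < errors else 0.0)

found = find_slices(rows, losses, ["country", "device"])
for description, size, stat in found:
    print(description, size, round(stat, 2))
```

On this synthetic data the two-feature slice ranks above its single-feature parents, since its error rate differs most from the remainder. A real system would also need to correct for multiple hypothesis testing and to distribute the grouping step, neither of which this sketch attempts.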