In machine learning (ML) workflows, determining the invariance qualities of an ML model is a common testing procedure. Traditionally, invariance qualities are evaluated using simple formula-based scores, e.g., accuracy. In this paper, we show that testing the invariance qualities of ML models can produce complex visual patterns that cannot be classified using simple formulas. To analyze such visual patterns automatically, and thereby test ML models using other ML models, we propose a systematic framework that is applicable to a variety of invariance qualities. We demonstrate the effectiveness and feasibility of the framework by developing ML4ML models (assessors) for determining the rotation-, brightness-, and size-variance of a collection of neural networks. Our testing results show that the trained ML4ML assessors can perform such analytical tasks with sufficient accuracy.
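As a minimal illustration of the kind of testing the abstract describes, one can evaluate a model at several levels of a transformation (e.g., rotation) and collect the pairwise differences of its responses into a matrix; the resulting "visual pattern" is what an ML4ML assessor would classify. The sketch below uses NumPy with a hypothetical stand-in model (`toy_model`) and coarse 90-degree rotations; the actual framework, models, and transformation ranges are assumptions not specified here.

```python
import numpy as np

def rotate(img, k):
    # Rotate a 2D array by k * 90 degrees; a coarse stand-in for
    # a finer-grained rotation sweep.
    return np.rot90(img, k)

def toy_model(img):
    # Hypothetical stand-in for a trained network: a fixed random
    # linear projection followed by a nonlinearity (seeded, so the
    # same weights are used for every call).
    rng = np.random.default_rng(0)
    w = rng.standard_normal(img.size)
    return float(np.tanh(w @ img.ravel()))

def invariance_matrix(model, img, n_levels=4):
    # Model responses at each transformation level.
    outs = np.array([model(rotate(img, k)) for k in range(n_levels)])
    # Pairwise absolute differences of responses: a perfectly
    # rotation-invariant model would yield an all-zero matrix, while
    # structured non-zero patterns indicate variance to the transform.
    return np.abs(np.subtract.outer(outs, outs))

img = np.arange(16, dtype=float).reshape(4, 4)
M = invariance_matrix(toy_model, img)
print(M.shape)
```

Such matrices (rendered as heatmaps) are the inputs an assessor model would be trained on, replacing a single formula-based invariance score.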