Machine learning models are increasingly used in practice. However, many machine learning methods are sensitive to test or operational data that is dissimilar to the training data. Out-of-distribution (OOD) data is known to increase the probability of error, and research is ongoing into metrics that identify which dissimilarities in data affect model performance. Recently, combinatorial coverage metrics have been explored in the literature as an alternative to distribution-based metrics. Results show that coverage metrics can correlate with classification error. However, other results show that the utility of coverage metrics is highly dataset-dependent. In this paper, we show that this dataset dependence can be alleviated with metric learning, a machine learning technique for learning latent spaces in which data from different classes is farther apart. In a study of six open-source datasets, we find that metric learning increased the difference between set difference combinatorial coverage metrics (SDCCMs) calculated on correctly and incorrectly classified data, thereby demonstrating that metric learning improves the ability of SDCCMs to anticipate classification error. Paired t-tests validate the statistical significance of our findings. Overall, we conclude that metric learning improves the ability of coverage metrics to anticipate classifier error and to identify when OOD data is likely to degrade model performance.
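To make the pipeline described above concrete, the following sketch pairs a supervised metric-learning embedding with a simplified 2-way set-difference combinatorial coverage value and a paired t-test over repeated train/test splits. It is an illustration under assumptions, not the authors' implementation: the digits dataset, the NeighborhoodComponentsAnalysis embedding, the binning choices, and the helper `sdcc_2way` are all hypothetical stand-ins for the paper's setup.

```python
# Illustrative sketch only: supervised metric learning (NCA), a simplified
# 2-way set-difference combinatorial coverage, and a paired t-test comparing
# coverage on incorrectly vs. correctly classified test points.
from itertools import combinations

from scipy.stats import ttest_rel
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.preprocessing import KBinsDiscretizer


def sdcc_2way(source_bins, target_bins):
    """Fraction of 2-way (dimension-pair) value combinations that appear in
    target_bins but never in source_bins -- a simplified set-difference
    combinatorial coverage (hypothetical helper, not the paper's code)."""
    if len(target_bins) == 0:
        return 0.0
    n_dims = source_bins.shape[1]
    missing, total = 0, 0
    for i, j in combinations(range(n_dims), 2):
        source_pairs = set(map(tuple, source_bins[:, [i, j]]))
        target_pairs = set(map(tuple, target_bins[:, [i, j]]))
        total += len(target_pairs)
        missing += len(target_pairs - source_pairs)
    return missing / total


X, y = load_digits(return_X_y=True)
sdcc_incorrect, sdcc_correct = [], []

for seed in range(5):  # repeated splits give paired samples for the t-test
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)

    # Metric learning: a supervised linear embedding that pulls classes apart.
    nca = NeighborhoodComponentsAnalysis(n_components=8, random_state=seed)
    Z_tr = nca.fit_transform(X_tr, y_tr)
    Z_te = nca.transform(X_te)

    # Classify in the learned space and split the test set by correctness.
    correct = KNeighborsClassifier(n_neighbors=5).fit(Z_tr, y_tr).predict(Z_te) == y_te

    # Discretize the embedded features so combinatorial coverage is well defined.
    binner = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="uniform").fit(Z_tr)
    B_tr, B_te = binner.transform(Z_tr), binner.transform(Z_te)

    sdcc_incorrect.append(sdcc_2way(B_tr, B_te[~correct]))
    sdcc_correct.append(sdcc_2way(B_tr, B_te[correct]))

# Paired t-test: is SDCC systematically higher for misclassified test points?
print(ttest_rel(sdcc_incorrect, sdcc_correct))
```

In this sketch, a larger SDCC gap between the misclassified and correctly classified partitions after the embedding step would mirror the effect the abstract attributes to metric learning; the specific metric learner, classifier, and binning scheme are interchangeable design choices.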