This study investigates how adding noise to the training set classes affects classification with the BCOPS algorithm (Balanced and Conformal Optimized Prediction Sets), proposed by Guan & Tibshirani (2022). BCOPS combines conformal prediction with a machine learning method to construct prediction sets that contain the true class of a test observation with at least a specified probability. An observation is considered an outlier if its true class is not present in the training set. Experiments on both synthetic and real datasets evaluate the abstention rate on outlier observations and the model's robustness in this previously untested scenario. The results indicate that even small amounts of noise can significantly affect model performance.
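The idea of a class-wise conformal prediction set that can abstain on outliers can be illustrated with a minimal sketch. This is not the BCOPS procedure itself: BCOPS learns its conformity score with a classifier, whereas the sketch below substitutes a simple distance-to-centroid score as a stand-in. The function names (`fit_centroids`, `conformity`, `prediction_set`), the two-class Gaussian toy data, and the train/calibration split are all assumptions made for illustration. An empty returned set is read here as abstention.

```python
import numpy as np

def fit_centroids(X_train, y_train):
    """Per-class centroids; a stand-in for the learned score used by BCOPS."""
    return {k: X_train[y_train == k].mean(axis=0) for k in np.unique(y_train)}

def conformity(x, centroid):
    """Higher means more typical of the class: negative Euclidean distance."""
    return -np.linalg.norm(x - centroid, axis=-1)

def prediction_set(x, centroids, X_cal, y_cal, alpha=0.05):
    """Include class k when the split-conformal p-value of x exceeds alpha."""
    labels = []
    for k, c in centroids.items():
        cal_scores = conformity(X_cal[y_cal == k], c)
        test_score = conformity(x, c)
        # Share of calibration points that conform no better than x (plus x itself).
        p_value = (1 + np.sum(cal_scores <= test_score)) / (len(cal_scores) + 1)
        if p_value > alpha:
            labels.append(int(k))
    return labels  # an empty list is interpreted as abstention

# Hypothetical toy data: two well-separated Gaussian classes in 2D.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2)),
    rng.normal(loc=[10.0, 0.0], scale=1.0, size=(200, 2)),
])
y = np.repeat([0, 1], 200)

# Even rows fit the score, odd rows calibrate it.
X_train, y_train = X[::2], y[::2]
X_cal, y_cal = X[1::2], y[1::2]
centroids = fit_centroids(X_train, y_train)

inlier_set = prediction_set(np.array([0.2, -0.1]), centroids, X_cal, y_cal)
outlier_set = prediction_set(np.array([100.0, 100.0]), centroids, X_cal, y_cal)
print("near class 0:", inlier_set)
print("far from both classes:", outlier_set)
```

Each class is tested separately, so a point that is atypical of every training class is rejected by all of them and receives no label. This is the abstention behaviour whose rate the study measures.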