Prediction failures of machine learning models often arise from deficiencies in the training data, such as incorrect labels, outliers, and selection biases. However, the data points responsible for a given failure mode are generally not known a priori, let alone a mechanism for repairing the failure. This work draws on the Bayesian view of continual learning and develops a generic framework both for identifying the training examples that give rise to the target failure and for fixing the model by erasing information about them. The framework naturally allows recent advances in continual learning to be leveraged for this new problem of model repairment, while subsuming existing work on influence functions and data deletion as specific instances. Experimentally, the proposed approach outperforms the baselines both at identifying detrimental training data and at fixing model failures in a generalisable manner.
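To make the identification step concrete, the following is a minimal, self-contained sketch of an influence-function-style heuristic: score each training example by how its loss gradient aligns with the gradient at a failure example, flagging the most conflicting points as candidates for erasure. This is an illustrative toy (plain logistic regression on synthetic data, gradient dot products without the Hessian term), not the paper's exact estimator; all names and the setup here are assumptions.

```python
import numpy as np

# Toy setup: logistic regression on synthetic data, with a few
# deliberately flipped labels acting as "detrimental" training points.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_loss(w, x, y):
    # Gradient of the per-example logistic loss w.r.t. the weights.
    return (sigmoid(x @ w) - y) * x

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (X @ true_w > 0).astype(float)
y[:10] = 1.0 - y[:10]  # corrupt the first 10 labels

# Train with plain full-batch gradient descent.
w = np.zeros(d)
for _ in range(500):
    w -= 0.1 * np.mean([grad_loss(w, X[i], y[i]) for i in range(n)], axis=0)

# Pick one (clean) example standing in for the observed failure, and
# score every training point by gradient alignment with it. Strongly
# negative scores indicate training points pulling the model in the
# opposite direction -- candidates for removal or unlearning.
g_fail = grad_loss(w, X[50], y[50])
scores = np.array([grad_loss(w, X[i], y[i]) @ g_fail for i in range(n)])
suspects = np.argsort(scores)[:10]
```

A full influence-function estimate would additionally weight the gradients by the inverse Hessian of the training loss; the dot-product form above is the cheap first-order approximation often used in practice.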