Backward compatibility of model predictions is a desired property when updating a machine learning driven application. It allows the underlying model to be improved seamlessly without introducing regression bugs. In classification tasks, these bugs occur in the form of negative flips: an instance that was correctly classified by the old model is classified incorrectly by the updated model. This directly degrades the user experience of such systems, e.g. a frequently used voice assistant query is suddenly misclassified. A common reason to update a model is that new training data becomes available and needs to be incorporated. Simply retraining the model on the updated data introduces unwanted negative flips. We study the problem of regression during data updates and propose Backward Compatible Weight Interpolation (BCWI). This method interpolates between the weights of the old and new model, and we show in extensive experiments that it reduces negative flips without sacrificing the improved accuracy of the new model. BCWI is straightforward to implement and does not increase inference cost. We also explore the use of importance weighting during interpolation, as well as averaging the weights of multiple new models, to further reduce negative flips.
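A minimal sketch of the interpolation step the abstract describes, assuming PyTorch-style models with identical architectures; the function name `bcwi_interpolate` and the coefficient `alpha` are illustrative choices, not taken from the paper:

```python
import copy
import torch

def bcwi_interpolate(old_model, new_model, alpha=0.5):
    """Linearly interpolate the parameters of the old and new model.

    alpha = 0.0 recovers the old model's weights, alpha = 1.0 the new
    model's; intermediate values trade off the new model's accuracy
    gains against negative flips. The result is a single model, so
    inference cost is unchanged.
    """
    old_state = old_model.state_dict()
    merged = copy.deepcopy(new_model)
    merged_state = {}
    for name, new_param in new_model.state_dict().items():
        old_param = old_state[name]
        if torch.is_floating_point(new_param):
            merged_state[name] = (1.0 - alpha) * old_param + alpha * new_param
        else:
            # Non-float buffers (e.g. step counters) are kept from the new model.
            merged_state[name] = new_param
    merged.load_state_dict(merged_state)
    return merged
```

The importance-weighted variant mentioned in the abstract would replace the scalar `alpha` with per-parameter weights, and the multi-model variant would average the state dicts of several new models before interpolating with the old one.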