Classifier predictions often rely on the assumption that new observations come from the same distribution as the training data. When the underlying distribution changes, so does the optimal classification rule, and performance may degrade. We consider the problem of detecting such a change in distribution in sequentially observed, unlabeled classification data. We focus on label shift, in which the class priors change but the class-conditional distributions remain unchanged. We reduce this problem to that of detecting a change in the one-dimensional classifier scores, which leads to simple nonparametric sequential changepoint detection procedures. Our procedures leverage the classifier training data to estimate the detection statistic, and they converge to their parametric counterparts as the size of the training data grows. In simulations, we show that our method outperforms other detection procedures in this label shift setting.
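To make the reduction to one-dimensional scores concrete, the following is a minimal illustrative sketch and not necessarily the procedure developed in the paper. It assumes a binary problem, scores in [0, 1], known pre- and post-change class priors (`pi_pre`, `pi_post`), a histogram estimate of the class-conditional score distributions from labeled training scores, and a standard CUSUM recursion. The function names, the Beta-distributed toy scores, the bin count, and the threshold are all hypothetical choices made for illustration.

```python
import math
import random


def histogram_density(scores, n_bins=10, smoothing=1.0):
    """Estimate bin probabilities for scores in [0, 1], with additive smoothing."""
    counts = [smoothing] * n_bins
    for s in scores:
        counts[min(int(s * n_bins), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def make_log_likelihood_ratio(scores0, scores1, pi_pre, pi_post, n_bins=10):
    """Build log(f_post / f_pre) on the score scale from labeled training scores.

    Under label shift only the class-1 mixing weight changes (pi_pre -> pi_post);
    the class-conditional score distributions are shared before and after.
    """
    p0 = histogram_density(scores0, n_bins)  # class-0 training scores
    p1 = histogram_density(scores1, n_bins)  # class-1 training scores

    def llr(s):
        b = min(int(s * n_bins), n_bins - 1)
        pre = pi_pre * p1[b] + (1 - pi_pre) * p0[b]
        post = pi_post * p1[b] + (1 - pi_post) * p0[b]
        return math.log(post / pre)

    return llr


def cusum_alarm_time(stream, llr, threshold):
    """Return the 1-based index of the first CUSUM alarm, or None if none occurs."""
    stat = 0.0
    for t, s in enumerate(stream, start=1):
        stat = max(0.0, stat + llr(s))  # CUSUM recursion on the score stream
        if stat >= threshold:
            return t
    return None


def draw_score(rng, pi):
    """Toy score model (an assumption): Beta(5, 2) for class 1, Beta(2, 5) for class 0."""
    if rng.random() < pi:
        return rng.betavariate(5, 2)
    return rng.betavariate(2, 5)


# Hypothetical demo: the class-1 prior shifts from 0.2 to 0.6 after 300 observations.
rng = random.Random(0)
train0 = [rng.betavariate(2, 5) for _ in range(2000)]
train1 = [rng.betavariate(5, 2) for _ in range(2000)]
llr = make_log_likelihood_ratio(train0, train1, pi_pre=0.2, pi_post=0.6)

stream = [draw_score(rng, 0.2) for _ in range(300)]
stream += [draw_score(rng, 0.6) for _ in range(300)]
print("alarm at t =", cusum_alarm_time(stream, llr, threshold=8.0))
```

The detector never sees labels on the stream; labels are used only on the training scores to estimate the two class-conditional histograms. In practice the post-change prior is usually unknown, so a fixed `pi_post` is a simplification of this sketch.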