Test-time adaptation is the problem of adapting a source pre-trained model using test inputs from a target domain, without access to source domain data. Most existing approaches address the setting in which the target domain is stationary. Moreover, these approaches are prone to making erroneous predictions with unreliable uncertainty estimates when distribution shifts occur. Hence, test-time adaptation under non-stationary target domain shift is a problem of significant interest. To address these issues, we propose a principled approach, PETAL (Probabilistic lifElong Test-time Adaptation with seLf-training prior), which treats the problem from a probabilistic perspective using a partly data-dependent prior. A student-teacher framework, in which the teacher model is an exponential moving average of the student model, emerges naturally from this probabilistic perspective. In addition, the knowledge from the posterior distribution obtained for the source task acts as a regularizer. To handle catastrophic forgetting in the long term, we also propose a data-driven model parameter resetting mechanism based on the Fisher information matrix (FIM). Experimental results suggest that FIM-based data-driven parameter restoration helps reduce error accumulation and maintain knowledge of the recent domain by restoring only the irrelevant parameters. In terms of predictive error rate as well as uncertainty-based metrics such as Brier score and negative log-likelihood, our method achieves better results than the current state of the art for online lifelong test-time adaptation across various benchmarks, including the CIFAR-10C, CIFAR-100C, ImageNetC, and ImageNet3DCC datasets.
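As a rough illustration of two mechanisms named above, the following minimal sketch shows an exponential-moving-average teacher update and a Fisher-guided parameter reset on flat lists of weights. It is not the PETAL implementation. The function names, the decay value, the use of a diagonal empirical Fisher estimate from squared per-sample gradients, and the quantile-based reset rule are all assumptions made for illustration; the abstract does not specify these details.

```python
from typing import List


def ema_update(teacher: List[float], student: List[float],
               decay: float = 0.999) -> List[float]:
    """Move each teacher weight toward the student weight (EMA).

    `decay` is an assumed hyperparameter, not a value taken from the paper.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def fisher_diagonal(per_sample_grads: List[List[float]]) -> List[float]:
    """Diagonal empirical Fisher estimate: mean squared gradient per parameter.

    Assumption: the gradients are supplied by the caller, one list per sample.
    """
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


def restore_low_fisher(current: List[float], source: List[float],
                       fisher: List[float], quantile: float = 0.03) -> List[float]:
    """Reset to source values the parameters with the lowest Fisher values.

    Parameters whose Fisher value is strictly below the chosen quantile
    threshold are treated as irrelevant to the current domain and restored.
    The quantile rule is a hypothetical choice for this sketch.
    """
    ranked = sorted(fisher)
    idx = min(int(quantile * len(ranked)), len(ranked) - 1)
    threshold = ranked[idx]
    return [s if f < threshold else c
            for c, s, f in zip(current, source, fisher)]


if __name__ == "__main__":
    teacher = ema_update([1.0, 2.0], [3.0, 4.0], decay=0.5)
    fisher = fisher_diagonal([[1.0, 2.0], [3.0, 0.0]])
    restored = restore_low_fisher([10.0, 20.0, 30.0, 40.0],
                                  [1.0, 2.0, 3.0, 4.0],
                                  [0.1, 4.0, 0.2, 9.0], quantile=0.5)
    print(teacher, fisher, restored)
```

In this sketch, parameters with high Fisher values keep their adapted values, which is one way to read "restoring only the irrelevant parameters"; a real model would apply the same logic per tensor and not to a flat list.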