Test-time adaptation (TTA) is an emerging paradigm that addresses distributional shifts between the training and testing phases without additional data acquisition or labeling cost; only unlabeled test data streams are used for continual model adaptation. Previous TTA schemes assume that the test samples are independent and identically distributed (i.i.d.), even though they are often temporally correlated (non-i.i.d.) in application scenarios, e.g., autonomous driving. We find that most existing TTA methods fail dramatically under such scenarios. Motivated by this, we present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams. Our novelty is mainly two-fold: (a) Instance-Aware Batch Normalization (IABN), which corrects normalization for out-of-distribution samples, and (b) Prediction-balanced Reservoir Sampling (PBRS), which simulates an i.i.d. data stream from a non-i.i.d. stream in a class-balanced manner. Our evaluation with various datasets, including real-world non-i.i.d. streams, demonstrates that the proposed robust TTA not only outperforms state-of-the-art TTA algorithms in the non-i.i.d. setting, but also achieves comparable performance to those algorithms under the i.i.d. assumption.
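The abstract describes PBRS only at a high level, so the following is a minimal sketch of one way a prediction-balanced reservoir could work, not the authors' implementation. It assumes a fixed-capacity memory keyed by the model's predicted class: a sample from an under-represented class evicts a random sample from a currently largest class, while a sample from a largest class replaces a stored sample of its own class with the usual reservoir-sampling probability. The class name `PredictionBalancedReservoir`, its methods, and the exact eviction rule are illustrative assumptions.

```python
import random
from collections import defaultdict


class PredictionBalancedReservoir:
    """Fixed-size memory balanced over predicted classes (illustrative sketch)."""

    def __init__(self, capacity, seed=None):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.memory = defaultdict(list)  # predicted class -> stored samples
        self.seen = defaultdict(int)     # predicted class -> samples seen so far
        self.rng = random.Random(seed)

    def __len__(self):
        return sum(len(bucket) for bucket in self.memory.values())

    def add(self, sample, predicted_class):
        """Offer one test sample together with the model's predicted class."""
        self.seen[predicted_class] += 1

        # While there is free space, store every sample.
        if len(self) < self.capacity:
            self.memory[predicted_class].append(sample)
            return

        # Classes that currently occupy the most slots.
        largest = max(len(bucket) for bucket in self.memory.values())
        majority = [c for c, b in self.memory.items() if len(b) == largest]

        if predicted_class not in majority:
            # Under-represented class: evict a random sample from a largest class.
            victim = self.memory[self.rng.choice(majority)]
            victim.pop(self.rng.randrange(len(victim)))
            self.memory[predicted_class].append(sample)
        else:
            # Largest class: reservoir sampling restricted to this class, so each
            # sample seen for the class is equally likely to be retained.
            bucket = self.memory[predicted_class]
            if self.rng.random() < len(bucket) / self.seen[predicted_class]:
                bucket[self.rng.randrange(len(bucket))] = sample

    def counts(self):
        """Number of stored samples per predicted class."""
        return {c: len(b) for c, b in self.memory.items() if b}


# Usage: a temporally correlated stream in which each class arrives in one long run.
buffer = PredictionBalancedReservoir(capacity=8, seed=0)
for cls in range(4):
    for i in range(100):
        buffer.add(sample=(cls, i), predicted_class=cls)
print(buffer.counts())
```

Although the stream presents the classes one after another, the memory ends up holding the same number of samples for each predicted class, which is the sense in which such a buffer approximates a class-balanced, i.i.d.-like batch. Because balancing relies on predictions rather than ground-truth labels, its quality depends on the model's accuracy.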