Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation by Anchored Clustering Regularized Self-Training

Deploying models on target domain data subject to distribution shift requires adaptation. Test-time training (TTT) has emerged as a solution to this adaptation problem under a realistic scenario in which the full source domain data is not accessible and instant inference on the target domain is required. Despite considerable effort devoted to TTT, confusion over experimental settings persists, leading to unfair comparisons. In this work, we first revisit TTT assumptions and categorize TTT protocols by two key factors. Among the multiple protocols, we adopt a realistic sequential test-time training (sTTT) protocol, under which we develop a test-time anchored clustering (TTAC) approach to enable stronger test-time feature learning. TTAC discovers clusters in both the source and target domains and matches the target clusters to the source ones to improve adaptation. When source domain information is strictly absent (i.e., source-free), we further develop an efficient method to infer source domain distributions for anchored clustering. Finally, self-training (ST) has demonstrated great success in learning from unlabeled data, but we find empirically that applying ST alone to TTT is prone to confirmation bias. We therefore introduce a more effective TTT approach that regularizes self-training with anchored clustering, and we refer to the improved model as TTAC++. We demonstrate that, under all TTT protocols, TTAC++ consistently outperforms state-of-the-art methods on five TTT datasets, covering corrupted target domains, selected hard samples, synthetic-to-real adaptation, and adversarially attacked target domains. We hope this work provides a fair benchmark for TTT methods and that future research will be compared within the respective protocols.
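To make the cluster-matching idea concrete, here is a minimal sketch of one way target clusters could be anchored to source clusters. The abstract does not specify the cluster model or the matching objective, so everything here is an assumption for illustration: each cluster is modeled as a diagonal Gaussian over features, target clusters are formed from pseudo-labels, and the mismatch is measured with a KL divergence from the source cluster to the target cluster. The function names (`fit_diag_gaussians`, `kl_diag`, `anchored_clustering_loss`) are hypothetical and are not taken from the authors' code.

```python
import math
from collections import defaultdict


def fit_diag_gaussians(features, labels, eps=1e-6):
    """Return {label: (mean, var)} with a diagonal variance per cluster."""
    groups = defaultdict(list)
    for x, y in zip(features, labels):
        groups[y].append(x)
    stats = {}
    for y, xs in groups.items():
        n, d = len(xs), len(xs[0])
        mean = [sum(x[j] for x in xs) / n for j in range(d)]
        # eps keeps the variance strictly positive for degenerate clusters.
        var = [sum((x[j] - mean[j]) ** 2 for x in xs) / n + eps for j in range(d)]
        stats[y] = (mean, var)
    return stats


def kl_diag(p, q):
    """KL(p || q) between two diagonal Gaussians given as (mean, var)."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def anchored_clustering_loss(source_stats, target_features, pseudo_labels):
    """Sum of per-cluster divergences over clusters present in both domains.

    Assumption: pseudo_labels are the model's own predictions on target data.
    """
    target_stats = fit_diag_gaussians(target_features, pseudo_labels)
    shared = [k for k in target_stats if k in source_stats]
    return sum(kl_diag(source_stats[k], target_stats[k]) for k in shared)


# Toy usage: two source clusters, and a target batch shifted by +1 per dimension.
source_features = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]
source_labels = [0, 0, 1, 1]
source_stats = fit_diag_gaussians(source_features, source_labels)

target_features = [[v + 1.0 for v in x] for x in source_features]
loss = anchored_clustering_loss(source_stats, target_features, source_labels)
```

In a real test-time training loop, a loss like this would be computed on features from the network being adapted and minimized by gradient descent, which requires an autodiff framework; the pure-Python version above only illustrates the quantity being matched. Only cluster statistics, not source samples, are needed on the source side, which fits the setting where full source data is unavailable.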
