Experiencing domain shifts during test-time is nearly inevitable in practice and likely results in a severe performance degradation. To overcome this issue, test-time adaptation continues to update the initial source model during deployment. A promising direction are methods based on self-training which have been shown to be well suited for gradual domain adaptation, since reliable pseudo-labels can be provided. In this work, we address two problems that exist when applying self-training in the setting of test-time adaptation. First, adapting a model to long test sequences that contain multiple domains can lead to error accumulation. Second, naturally, not all shifts are gradual in practice. To tackle these challenges, we introduce GTTA. By creating artificial intermediate domains that divide the current domain shift into a more gradual one, effective self-training through high quality pseudo-labels can be performed. To create the intermediate domains, we propose two independent variations: mixup and light-weight style transfer. We demonstrate the effectiveness of our approach on the continual and gradual corruption benchmarks, as well as ImageNet-R. To further investigate gradual shifts in the context of urban scene segmentation, we publish a new benchmark: CarlaTTA. It enables the exploration of several non-stationary domain shifts.
翻译:测试时发生域变换在实践中几乎不可避免,并可能导致严重性能退化。为了克服这一问题,测试时间适应在部署期间继续更新初始源模式。一个有希望的方向是基于自我培训的方法,这些方法已经证明非常适合逐步进行域变换,因为可以提供可靠的伪标签。在这项工作中,我们处理在测试时间适应设置方面进行自我培训时存在的两个问题。首先,将模型适应包含多个域的长测试序列可能导致错误积累。第二,自然,并非所有的变换都是逐步的。为了应对这些挑战,我们引入了GTTA。通过创建人为的中间域,将当前域变换分成一个更加渐进的域,通过高质量的伪标签可以进行有效的自我培训。为了创建中间域,我们提出了两种独立的变换:混合和轻量风格转换。我们展示了我们在持续和渐进的腐败基准上的方法的有效性,以及图像网络-R。为了进一步调查城市景区分割背景下的逐渐变换,我们公布了一个新的基准:卡拉特塔。它使一些非静止域得以探索。