Test-time training offers a new approach to the problem of domain shift. In this framework, a test-time training phase is inserted between the training phase and the test phase. During test-time training, parts of the model are usually updated using the test sample(s), and the updated model is then used in the test phase. However, using test samples for test-time training has limitations. First, it can cause the model to overfit to the test-time procedure, which hurts performance on the main task. Second, updating one part of the model while leaving the rest unchanged induces a mismatch between the updated and static parts, which makes it hard to improve performance on the main task. To alleviate these problems, we propose using mixup in test-time training (MixTTT), which controls the change in the model's parameters while still completing the test-time procedure. We show theoretically that MixTTT alleviates the mismatch between the updated part and the static part for the main task, acting as a specific regularization effect for test-time training. MixTTT can be used as an add-on module in general test-time-training-based methods to further improve their performance. Experimental results show the effectiveness of our method.
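The abstract does not spell out how MixTTT applies mixup, so the following is only a minimal sketch of the standard mixup operation that the method builds on, not the authors' implementation. It assumes a batch of test-time inputs and soft targets for some auxiliary test-time task; the function name `mixup_batch`, the default `alpha=0.4`, and the toy data are all hypothetical and chosen for illustration.

```python
import random


def mixup_batch(inputs, targets, alpha=0.4, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch.

    inputs:  list of feature vectors (lists of floats)
    targets: list of soft-label vectors for an assumed auxiliary task
    alpha:   Beta(alpha, alpha) concentration (hypothetical default)
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    perm = list(range(len(inputs)))
    rng.shuffle(perm)  # perm[i] is the partner index of sample i

    def blend(a, b):
        # Convex combination of two equally sized vectors.
        return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]

    mixed_x = [blend(inputs[i], inputs[j]) for i, j in enumerate(perm)]
    mixed_y = [blend(targets[i], targets[j]) for i, j in enumerate(perm)]
    return mixed_x, mixed_y, lam


# Toy batch of three test samples with one-hot auxiliary targets (made-up data).
xs = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
ys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
mixed_x, mixed_y, lam = mixup_batch(xs, ys, rng=random.Random(0))
```

In a test-time training loop, the mixed pairs would stand in for the raw test samples when computing the test-time loss. Because every update is then driven by interpolated inputs and targets, the updated part of the model is discouraged from drifting too far on any single test sample, which is one plausible reading of the regularization effect described above.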