Recently, source-free unsupervised domain adaptation (SFUDA) has emerged as a more practical and feasible approach than unsupervised domain adaptation (UDA), which assumes that labeled source data are always accessible. However, significant limitations of SFUDA approaches are often overlooked, which restricts their practicality in real-world applications. These limitations include the lack of a principled way to determine optimal hyperparameters, and performance degradation when the unlabeled target data fail to meet certain requirements, such as having a closed label set and a label distribution identical to that of the source data. All of these limitations stem from the fact that SFUDA relies entirely on unlabeled target data. We empirically demonstrate the limitations of existing SFUDA methods in real-world scenarios, including out-of-distribution and label distribution shifts in target data, and verify that none of these methods can be safely applied to real-world settings. Based on our experimental results, we argue that fine-tuning a source-pretrained model with a few labeled samples (e.g., 1- or 3-shot) is a practical and reliable solution that circumvents the limitations of SFUDA. Contrary to common belief, we find that carefully fine-tuned models do not suffer from overfitting even when trained with only a few labeled samples, and show little change in performance due to sampling bias. Our experimental results on various domain adaptation benchmarks demonstrate that the few-shot fine-tuning approach performs comparably under the standard SFUDA settings and outperforms comparison methods under realistic scenarios. Our code is available at https://github.com/daintlab/fewshot-SFDA.
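To make the few-shot fine-tuning recipe concrete, the sketch below shows the two steps the abstract describes: drawing a k-shot labeled subset per class from the target data (e.g., k=1 or 3), then fine-tuning a source-pretrained classifier on that subset. This is a minimal illustration in NumPy with a linear classifier standing in for the pretrained model; the function names and the gradient-descent fine-tuning loop are assumptions for exposition, not the authors' actual implementation (see the linked repository for that).

```python
import numpy as np

def sample_k_shot(features, labels, k, seed=0):
    """Select a k-shot labeled subset: k examples per class from the target data."""
    rng = np.random.default_rng(seed)
    idx = []
    for c in np.unique(labels):
        cls_idx = np.flatnonzero(labels == c)
        idx.extend(rng.choice(cls_idx, size=k, replace=False))
    return features[idx], labels[idx]

def finetune_linear(W, b, X, y, lr=0.1, steps=200):
    """Fine-tune a (hypothetical) source-pretrained linear head W, b
    on the few labeled target samples via softmax cross-entropy."""
    for _ in range(steps):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)             # softmax probabilities
        p[np.arange(len(y)), y] -= 1.0                # dLoss/dlogits for cross-entropy
        W -= lr * (X.T @ p) / len(y)                  # gradient step on weights
        b -= lr * p.mean(axis=0)                      # gradient step on bias
    return W, b
```

Even with only a handful of labeled target samples, the fine-tuned head typically adapts well on separable features, which mirrors the abstract's observation that careful few-shot fine-tuning avoids overfitting in practice.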