Unsupervised domain adaptation (UDA) enables a learning machine to adapt from a labeled source domain to an unlabeled target domain under distribution shift. Thanks to the strong representation ability of deep neural networks, recent advances in UDA rely on learning domain-invariant features. Intuitively, the hope is that a good feature representation, together with the hypothesis learned from the source domain, can generalize well to the target domain. However, learning the domain-invariant features and the source hypothesis inevitably involves domain-specific information, which degrades the generalizability of UDA models on the target domain. In this paper, motivated by the lottery ticket hypothesis that only a subset of parameters is essential for generalization, we find that only a subset of parameters is essential for learning domain-invariant information and generalizing well in UDA. We term these transferable parameters. In contrast, the remaining parameters tend to fit domain-specific details and often fail to generalize; we term these untransferable parameters. Driven by this insight, we propose Transferable Parameter Learning (TransPar) to reduce the side effect of domain-specific information in the learning process and thus enhance the memorization of domain-invariant information. Specifically, in each training iteration we divide all parameters into transferable and untransferable ones according to their degree of distribution discrepancy. We then apply separate update rules to the two types of parameters. Extensive experiments on image classification and regression tasks (keypoint detection) show that TransPar outperforms prior methods by non-trivial margins. Moreover, experiments demonstrate that TransPar can be integrated into the most popular deep UDA networks and can be easily extended to handle any data distribution shift scenario.
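The per-iteration parameter split and the separate update rules mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact criterion or update rules, so everything specific here is an assumption made for illustration: the score |w * dD/dw| (weight times the gradient of a discrepancy loss D), the `ratio` of parameters kept as transferable, and the choice to update untransferable parameters by shrinkage only. The function name `split_and_update` is hypothetical.

```python
def split_and_update(params, task_grads, disc_grads, lr=0.1, ratio=0.5, decay=0.01):
    """One illustrative update step over a flat list of parameters.

    params:     current parameter values
    task_grads: gradients of the task loss w.r.t. each parameter
    disc_grads: gradients of a domain-discrepancy loss w.r.t. each parameter
    ratio:      fraction of parameters treated as transferable (assumed)
    """
    # Assumed criterion: a large |w * dD/dw| marks a parameter as strongly
    # tied to the domain discrepancy, i.e. to domain-specific information.
    scores = [abs(w * g) for w, g in zip(params, disc_grads)]

    # Keep the lowest-scoring fraction as transferable parameters.
    k = int(ratio * len(params))
    order = sorted(range(len(params)), key=lambda i: scores[i])
    transferable = set(order[:k])

    new_params = []
    for i, w in enumerate(params):
        if i in transferable:
            # Transferable: ordinary gradient step on the task loss.
            new_params.append(w - lr * task_grads[i])
        else:
            # Untransferable (assumed rule): no gradient step, only shrinkage.
            new_params.append(w * (1.0 - lr * decay))
    return new_params, sorted(transferable)


new_params, transferable_idx = split_and_update(
    params=[1.0, -2.0, 0.5, 3.0],
    task_grads=[1.0, 1.0, 1.0, 1.0],
    disc_grads=[0.1, 1.0, 0.4, 2.0],
)
print(transferable_idx, new_params)
```

In this toy call, the parameters at indices 0 and 2 have the smallest scores and receive the gradient step, while the other two are only shrunk toward zero. In a real network the split would be recomputed in every iteration, as the abstract states.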