The vast majority of existing algorithms for unsupervised domain adaptation (UDA) focus on adapting from a labeled source domain to an unlabeled target domain directly, in a one-off fashion. Gradual domain adaptation (GDA), on the other hand, assumes a path of $(T-1)$ unlabeled intermediate domains bridging the source and target, and aims to provide better generalization in the target domain by leveraging the intermediate ones. Under certain assumptions, Kumar et al. (2020) proposed a simple algorithm, Gradual Self-Training, along with a generalization bound on the order of $e^{O(T)} \left(\varepsilon_0+O\left(\sqrt{\log(T)/n}\right)\right)$ for the target domain error, where $\varepsilon_0$ is the source domain error and $n$ is the data size of each domain. Due to the exponential factor, this upper bound becomes vacuous when $T$ is only moderately large. In this work, we analyze gradual self-training under more general and relaxed assumptions, and prove a significantly improved generalization bound of $\varepsilon_0+ O \left(T\Delta + T/\sqrt{n}\right) + \widetilde{O}\left(1/\sqrt{nT}\right)$, where $\Delta$ is the average distributional distance between consecutive domains. Compared with the existing bound, which carries an exponential dependency on $T$ as a multiplicative factor, our bound depends on $T$ only linearly and additively. Perhaps more interestingly, our result implies the existence of an optimal choice of $T$ that minimizes the generalization error, and it also naturally suggests an optimal way to construct the path of intermediate domains so as to minimize the cumulative path length $T\Delta$ between the source and target. To corroborate the implications of our theory, we examine gradual self-training on multiple semi-synthetic and real datasets, which confirms our findings. We believe our insights provide a path forward toward the design of future GDA algorithms.
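For readers unfamiliar with the algorithm being analyzed, the following is a minimal sketch of gradual self-training in the spirit of Kumar et al. (2020). All names (`gradual_self_train`, the scikit-learn classifier choice) are illustrative assumptions, not taken from the paper's code: the model is first fit on the labeled source data, then repeatedly pseudo-labels each successive unlabeled intermediate domain and retrains on those pseudo-labels until the target domain is reached.

```python
# Minimal sketch of gradual self-training; names and classifier choice
# are illustrative, not from Kumar et al. (2020)'s implementation.
from sklearn.linear_model import LogisticRegression

def gradual_self_train(source_x, source_y, intermediate_xs, target_x):
    """Adapt from a labeled source to an unlabeled target via T-1
    intermediate domains.

    intermediate_xs: list of unlabeled feature arrays, ordered along
    the path from source to target; target_x is the final domain.
    """
    # Initial hypothesis: fit on labeled source data (error epsilon_0).
    model = LogisticRegression(max_iter=1000)
    model.fit(source_x, source_y)

    # Walk along the path: pseudo-label each domain with the current
    # model, then retrain on those pseudo-labels (one self-training
    # step per domain). Kumar et al. additionally keep only
    # high-confidence predictions; that filtering is omitted here.
    for domain_x in list(intermediate_xs) + [target_x]:
        pseudo_y = model.predict(domain_x)  # hard pseudo-labels
        model = LogisticRegression(max_iter=1000)
        model.fit(domain_x, pseudo_y)

    return model  # final hypothesis, evaluated on the target domain
```

The sketch makes the trade-off behind the bound concrete: taking more intermediate steps shrinks the per-step distance $\Delta$, but each retraining round adds finite-sample error, reflected in the additive $T/\sqrt{n}$ term, which is why an optimal choice of $T$ exists.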