We introduce the problem of domain adaptation under Open Set Label Shift (OSLS), where the label distribution can change arbitrarily and a new class may arrive during deployment, but the class-conditional distributions p(x|y) are domain-invariant. OSLS subsumes domain adaptation under label shift and Positive-Unlabeled (PU) learning. The learner's goals here are twofold: (a) estimate the target label distribution, including the novel class; and (b) learn a target classifier. First, we establish necessary and sufficient conditions for identifying these quantities. Second, motivated by advances in label shift and PU learning, we propose practical methods for both tasks that leverage black-box predictors. Unlike typical Open Set Domain Adaptation (OSDA) problems, which tend to be ill-posed and amenable only to heuristics, OSLS offers a well-posed problem amenable to more principled machinery. Experiments across numerous semi-synthetic benchmarks on vision, language, and medical datasets demonstrate that our methods consistently outperform OSDA baselines, achieving 10--25% improvements in target domain accuracy. Finally, we analyze the proposed methods, establishing finite-sample convergence to the true label marginal and, for linear models in a Gaussian setup, convergence to the optimal classifier. Code is available at https://github.com/acmi-lab/Open-Set-Label-Shift.
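To make goal (a) concrete, here is a minimal sketch of the OSLS setting on a toy problem. It is an illustration under strong assumptions, not the estimator proposed in the paper. It assumes a discrete feature with exactly known distributions, and it assumes each known class has an "anchor" bin where no other class (including the novel one) puts mass. Under these assumptions the target marginal is a mixture of the source class-conditionals plus a novel component, and each known-class proportion equals the smallest ratio of target mass to class-conditional mass. The function name `estimate_osls_prior` and all numbers are hypothetical.

```python
import numpy as np


def estimate_osls_prior(source_cond, target_marginal):
    """Toy estimate of the target label distribution under OSLS.

    source_cond: (k, d) array, row j is p(x | y = j) for known class j.
    target_marginal: (d,) array, the target marginal p_t(x).
    Returns a (k + 1,) array; the last entry is the novel-class proportion.
    Assumes (hypothetically) exact discrete distributions and an anchor bin
    per known class; real data would need finite-sample estimators instead.
    """
    source_cond = np.asarray(source_cond, dtype=float)
    target_marginal = np.asarray(target_marginal, dtype=float)
    known = np.empty(source_cond.shape[0])
    for j, cond in enumerate(source_cond):
        support = cond > 0  # only bins where class j has mass
        # The mixture weight of class j cannot exceed p_t(x) / p(x | y = j)
        # at any x; the bound is tight at an anchor bin of class j.
        known[j] = np.min(target_marginal[support] / cond[support])
    novel = max(0.0, 1.0 - known.sum())  # remaining mass goes to the new class
    return np.append(known, novel)


# Two known classes over five bins; bins 0 and 2 are their anchor bins.
source_cond = np.array([
    [0.50, 0.50, 0.00, 0.00, 0.00],
    [0.00, 0.25, 0.75, 0.00, 0.00],
])
# The novel class is seen only in the target domain.
novel_cond = np.array([0.00, 0.25, 0.00, 0.50, 0.25])
true_prior = np.array([0.2, 0.5, 0.3])  # shifted label marginal, novel class last

# p(x | y) is domain-invariant, so the target marginal is a mixture.
target_marginal = true_prior[:2] @ source_cond + true_prior[2] * novel_cond

estimated = estimate_osls_prior(source_cond, target_marginal)
print(estimated)
```

Without the anchor-bin assumption, the minimum ratio only upper-bounds each known-class proportion, which is one way to see why identifiability conditions matter in this setting.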