Despite the emergence of principled methods for domain adaptation under label shift, the sensitivity of these methods to minor shifts in the class conditional distributions remains precariously underexplored. Meanwhile, popular deep domain adaptation heuristics tend to falter when faced with shifts in label proportions. While several papers attempt to adapt these heuristics to accommodate shifts in label proportions, inconsistencies in evaluation criteria, datasets, and baselines make it hard to assess the state of the art. In this paper, we introduce RLSbench, a large-scale relaxed label shift benchmark consisting of >500 distribution shift pairs that draw on 14 datasets across vision, tabular, and language modalities and compose them with varying label proportions. First, we evaluate 13 popular domain adaptation methods, demonstrating more widespread failures under label proportion shifts than were previously known. Next, we develop an effective two-step meta-algorithm that is compatible with most deep domain adaptation heuristics: (i) pseudo-balance the data at each epoch; and (ii) adjust the final classifier with (an estimate of) the target label distribution. The meta-algorithm often improves existing domain adaptation heuristics by 2--10\% accuracy points under extreme label proportion shifts and has little (i.e., <0.5\%) effect when label proportions do not shift. We hope that these findings and the availability of RLSbench will encourage researchers to rigorously evaluate proposed methods in relaxed label shift settings. Code is publicly available at https://github.com/acmi-lab/RLSbench.
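The two steps of the meta-algorithm can be illustrated with a minimal NumPy sketch. This is not the RLSbench implementation: the function names `balanced_indices` and `adjust_posteriors` are hypothetical, the target label distribution is assumed to be supplied by some separate estimator, and the classifier is assumed to have been trained on (pseudo-)balanced data, so the source prior defaults to uniform.

```python
import numpy as np


def balanced_indices(labels, num_classes, rng):
    """Step (i), sketched: resample indices with replacement so that every
    class present in `labels` appears equally often. For target data,
    `labels` would be pseudo-labels from the current model (an assumption)."""
    labels = np.asarray(labels)
    per_class = max(1, len(labels) // num_classes)
    chosen = []
    for c in range(num_classes):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue  # class never (pseudo-)labelled; nothing to sample
        chosen.append(rng.choice(idx, size=per_class, replace=True))
    return np.concatenate(chosen)


def adjust_posteriors(probs, target_prior, source_prior=None):
    """Step (ii), sketched: re-weight softmax outputs by the ratio of the
    (estimated) target label distribution to the training label distribution,
    then renormalise each row."""
    probs = np.asarray(probs, dtype=float)
    target_prior = np.asarray(target_prior, dtype=float)
    if source_prior is None:
        # Assumed uniform because training data was balanced in step (i).
        source_prior = np.full(probs.shape[1], 1.0 / probs.shape[1])
    reweighted = probs * (target_prior / source_prior)
    return reweighted / reweighted.sum(axis=1, keepdims=True)


# Usage: 90/10 imbalanced pseudo-labels become a 50/50 resampled epoch.
rng = np.random.default_rng(0)
pseudo_labels = np.array([0] * 90 + [1] * 10)
epoch_idx = balanced_indices(pseudo_labels, num_classes=2, rng=rng)
counts = np.bincount(pseudo_labels[epoch_idx], minlength=2)

# Usage: an uninformative prediction is pulled toward the target prior.
adjusted = adjust_posteriors([[0.5, 0.5]], target_prior=[0.9, 0.1])
```

When the estimated target prior equals the training prior, the weights are all one and the predictions are unchanged, which is consistent with the small effect reported when label proportions do not shift.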