We address the problem of active learning under label shift: the setting in which the class proportions of the source and target domains differ. We introduce a "medial distribution" that mediates the tradeoff between importance weighting and class-balanced sampling, and propose using the two in combination for active learning. We call the resulting method Mediated Active Learning under Label Shift (MALLS). It balances the bias from class-balanced sampling against the variance from importance weighting. We prove sample-complexity and generalization guarantees for MALLS, showing that active learning reduces asymptotic sample complexity even under arbitrary label shift. We empirically demonstrate that MALLS scales to high-dimensional datasets and can reduce the sample complexity of active learning by 60% on deep active learning tasks.
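To make the bias/variance tradeoff concrete, here is a minimal sketch of one plausible construction of a medial class distribution: a geometric interpolation between class-balanced (uniform) sampling and sampling proportional to the target prior, with samples reweighted back toward the target. This is an illustrative assumption, not the paper's exact definition; the names `alpha`, `source_prior`, and `target_prior` are hypothetical.

```python
import numpy as np

def medial_distribution(target_prior: np.ndarray, alpha: float) -> np.ndarray:
    """Geometric interpolation between class-balanced sampling (alpha = 0,
    low variance but biased) and sampling proportional to the target prior
    (alpha = 1, unbiased under reweighting but high variance)."""
    uniform = np.full_like(target_prior, 1.0 / len(target_prior))
    q = uniform ** (1.0 - alpha) * target_prior ** alpha
    return q / q.sum()  # normalize to a valid class distribution

def importance_weights(target_prior: np.ndarray, medial: np.ndarray) -> np.ndarray:
    """Per-class weights correcting samples drawn under the medial
    distribution toward the target distribution. These weights are milder
    than the full target/source ratio, trading some bias for lower variance."""
    return target_prior / medial

# Hypothetical example: a pool skewed toward class 0, a target skewed toward class 2.
source_prior = np.array([0.7, 0.2, 0.1])
target_prior = np.array([0.1, 0.3, 0.6])
q = medial_distribution(target_prior, alpha=0.5)
w = importance_weights(target_prior, q)
print("medial distribution:", q)
print("importance weights:", w)
```

Under this sketch, `alpha` controls the tradeoff the abstract describes: small `alpha` queries classes nearly uniformly and keeps the correction weights (and hence the variance) small, while `alpha` near 1 recovers full importance weighting toward the target prior.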