The goal of active learning is to achieve the accuracy attainable by passive learning while using far fewer labels. Exponential savings in label complexity have been proved in very special cases, but fundamental lower bounds show that such improvements are impossible in general. This suggests a need to explore alternative goals for active learning. Learning with abstention is one such alternative. In this setting, the active learning algorithm may abstain from predicting and, in that case, incurs an error only marginally smaller than that of random guessing. We develop the first computationally efficient active learning algorithm with abstention. Our algorithm provably achieves $\mathsf{polylog}(\frac{1}{\varepsilon})$ label complexity, without any low-noise conditions. This performance guarantee reduces the label complexity by an exponential factor relative to passive learning and to active learning that is not allowed to abstain. Furthermore, our algorithm is guaranteed to abstain only on hard examples (where the true label distribution is close to a fair coin), a novel property we term proper abstention, which also leads to a host of other desirable characteristics (e.g., recovering minimax guarantees in the standard setting and avoiding the undesirable "noise-seeking" behavior often seen in active learning). We also provide novel extensions of our algorithm that achieve constant label complexity and handle model misspecification.
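For concreteness, the abstention setting described above is commonly formalized via Chow's error: a hypothesis $h$ may output an abstention symbol $\bot$, and abstaining incurs a cost of $\tfrac{1}{2} - \gamma$ for some margin parameter $\gamma > 0$, i.e., marginally less than the $\tfrac{1}{2}$ error of random guessing. The notation below ($h$, $\bot$, $\gamma$, $\eta$, $c$) is illustrative rather than taken from the abstract; it sketches one standard way to state the model.
$$
\mathrm{err}_\gamma(h) \;=\; \Pr_{(x,y)}\big[h(x) \neq y,\ h(x) \neq \bot\big] \;+\; \Big(\tfrac{1}{2} - \gamma\Big)\,\Pr_{x}\big[h(x) = \bot\big].
$$
Under this formalization, the proper abstention property corresponds to requiring $h(x) = \bot$ only on points where $\big|\eta(x) - \tfrac{1}{2}\big| \le c\,\gamma$ for some constant $c$, where $\eta(x) = \Pr[y = 1 \mid x]$ is the true label distribution; that is, the algorithm abstains only where the label is nearly a fair coin flip.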