Bayesian Batch Active Learning as Sparse Subset Approximation

Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models. When the cost of acquiring labels is high, probabilistic active learning methods can be used to greedily select the most informative data points to be labeled. However, for many large-scale problems standard greedy procedures become computationally infeasible and suffer from negligible model change. In this paper, we introduce a novel Bayesian batch active learning approach that mitigates these issues. Our approach is motivated by approximating the complete data posterior of the model parameters. While naive batch construction methods result in correlated queries, our algorithm produces diverse batches that enable efficient active learning at scale. We derive interpretable closed-form solutions akin to existing active learning procedures for linear models, and generalize to arbitrary models using random projections. We demonstrate the benefits of our approach on several large-scale regression and classification tasks.
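The problem with naive batch construction mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a maximum-entropy acquisition score as a stand-in for "informativeness", and the names `predictive_entropy`, `naive_top_b`, and `pool_probs` are hypothetical. It shows only the baseline behavior the abstract argues against, where each point is scored independently and the top `b` are taken.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predictive class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def naive_top_b(pool_probs, b):
    """Return indices of the b pool points with the highest predictive entropy.

    Each point is scored on its own, so near-duplicate points receive
    near-identical scores and tend to be selected together.
    """
    scores = [predictive_entropy(p) for p in pool_probs]
    ranked = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:b]


# Hypothetical predictive distributions for five unlabeled points.
# Points 0 and 1 are exact duplicates, so labeling both adds little.
pool_probs = [
    [0.50, 0.50],
    [0.50, 0.50],
    [0.60, 0.40],
    [0.90, 0.10],
    [0.99, 0.01],
]

batch = naive_top_b(pool_probs, b=2)
print(batch)
```

With this toy pool the selected batch consists of the two duplicate points, which is the kind of correlated query set that a diversity-aware batch method is meant to avoid.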

Related content

Active learning

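Active learning is a subfield of machine learning (and, more generally, of artificial intelligence); in statistics it is also known as query learning or optimal experimental design. The "learning module" and the "selection strategy" are the two basic and essential components of an active learning algorithm.

In education, the same term has a separate meaning: a method of learning in which students are actively or experientially involved in the learning process, with different levels of active learning depending on the degree of student involvement (Bonwell & Eison 1991). Bonwell & Eison (1991) state that students take part in active learning when they do something besides passively listening. In a report from the Association for the Study of Higher Education (ASHE), the authors discuss a variety of methods for promoting active learning. They cite literature indicating that students must do more than just listen in order to learn: they must read, write, discuss, and engage in solving problems. This process relates to three learning domains, namely knowledge, skills, and attitudes (KSA). This taxonomy of learning behaviors can be thought of as "the goals of the learning process". In particular, students must engage in higher-order thinking tasks such as analysis, synthesis, and evaluation.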
