Minority Class Oriented Active Learning for Imbalanced Datasets

Active learning aims to optimize the dataset annotation process when resources are constrained. Most existing methods are designed for balanced datasets, which limits their practical applicability because a majority of real-life datasets are imbalanced. Here, we introduce a new active learning method designed for imbalanced datasets. It favors samples likely to belong to minority classes so as to reduce the imbalance of the labeled subset and create a better representation for these classes. We also compare two training schemes for active learning: (1) the one commonly deployed in deep active learning, which fine-tunes the model at each iteration, and (2) a scheme inspired by transfer learning, which exploits generic pre-trained models and trains shallow classifiers at each iteration. Evaluation is run on three imbalanced datasets. Results show that the proposed active learning method outperforms competitive baselines. Equally interesting, they also indicate that the transfer learning training scheme outperforms model fine-tuning if features are transferable from the generic dataset to the unlabeled one. This last result is surprising and should encourage the community to explore the design of deep active learning methods.
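The abstract does not spell out the acquisition criterion, so the following is only a minimal sketch of the general idea of favoring samples likely to belong to minority classes. It assumes a classifier has already produced class probabilities for the unlabeled pool. The function names `minority_oriented_scores` and `select_batch`, and the smoothed inverse-frequency weighting, are hypothetical choices made for illustration and are not the paper's actual method.

```python
import numpy as np


def minority_oriented_scores(probs, labeled_counts):
    """Score unlabeled samples by predicted mass on under-represented classes.

    probs: (n_samples, n_classes) predicted class probabilities.
    labeled_counts: (n_classes,) number of labeled samples per class so far.
    Hypothetical heuristic for illustration, not the paper's criterion.
    """
    counts = np.asarray(labeled_counts, dtype=float)
    # Classes with fewer labels get larger weights (smoothed inverse frequency).
    weights = 1.0 / (counts + 1.0)
    weights /= weights.sum()
    # Each score is the probability mass weighted toward rare classes.
    return np.asarray(probs, dtype=float) @ weights


def select_batch(probs, labeled_counts, batch_size):
    """Return indices of the batch_size highest-scoring unlabeled samples."""
    scores = minority_oriented_scores(probs, labeled_counts)
    order = np.argsort(-scores, kind="stable")
    return order[:batch_size]


if __name__ == "__main__":
    # Toy pool: class 0 is the majority, class 1 the minority.
    probs = np.array([
        [0.9, 0.1],  # confidently majority
        [0.2, 0.8],  # likely minority
        [0.5, 0.5],  # uncertain
    ])
    labeled_counts = [90, 10]
    print(select_batch(probs, labeled_counts, batch_size=2))
```

In this toy pool the likely-minority sample is ranked first and the uncertain one second, so annotating the selected batch would tend to reduce the imbalance of the labeled subset. In the transfer learning scheme described above, `probs` would come from a shallow classifier trained on features from a frozen pre-trained model. In the fine-tuning scheme it would come from the fine-tuned network itself.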


