In this paper, we proposed a new clustering-based active learning framework, Active Learning using a Clustering-based Sampling (ALCS), to address the shortage of labeled data. ALCS employs a density-based clustering approach to explore the cluster structure of the data without requiring exhaustive parameter tuning. A bi-cluster boundary-based sample query procedure is introduced to improve learning performance when classifying highly overlapped classes. Additionally, we developed an effective diversity exploration strategy to reduce redundancy among the queried samples. Our experimental results demonstrate the efficacy of the ALCS approach.
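To make the interplay between boundary-based querying and diversity exploration concrete, the following is a minimal sketch under stated assumptions, not the ALCS procedure itself. It assumes that cluster labels for two clusters are already available (for instance from a density-based clustering step), scores each sample by how close it lies to the boundary between the two cluster centroids, and greedily skips candidates that fall too close to an already selected query. The function names `boundary_margin` and `select_queries`, the centroid-based margin, and the `min_dist` threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def boundary_margin(X, labels):
    """Illustrative boundary score for two clusters (an assumption, not ALCS).

    Returns |d0 - d1|, where d0 and d1 are the distances of each sample to the
    centroids of cluster 0 and cluster 1. A small value means the sample lies
    near the boundary between the two clusters.
    """
    c0 = X[labels == 0].mean(axis=0)
    c1 = X[labels == 1].mean(axis=0)
    d0 = np.linalg.norm(X - c0, axis=1)
    d1 = np.linalg.norm(X - c1, axis=1)
    return np.abs(d0 - d1)


def select_queries(X, margin, k, min_dist):
    """Greedily pick up to k samples with the smallest margin.

    A candidate is skipped when it lies closer than min_dist to a sample that
    was already chosen; this is a simple stand-in for a diversity strategy.
    """
    chosen = []
    for i in np.argsort(margin):
        if all(np.linalg.norm(X[i] - X[j]) >= min_dist for j in chosen):
            chosen.append(int(i))
        if len(chosen) == k:
            break
    return chosen


# Toy one-dimensional data; the labels are assumed to come from a prior
# clustering step and are hard-coded here for illustration.
X = np.array([[0.0], [1.0], [2.0], [2.1], [3.0], [4.0], [5.0]])
labels = np.array([0, 0, 0, 0, 1, 1, 1])

margin = boundary_margin(X, labels)
queries = select_queries(X, margin, k=2, min_dist=0.95)
print(queries)  # [4, 2]
```

With `min_dist=0.0` the same call returns the two samples nearest the boundary, indices 4 and 3, which lie only 0.9 apart. The distance threshold replaces the second of these with index 2, trading a slightly larger margin for less redundancy among the queried samples.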