As deep learning becomes mainstream in natural language processing, the need for suitable active learning methods is becoming unprecedentedly urgent. Active Learning (AL) methods based on nearest neighbor classifiers have been proposed and have demonstrated superior results. However, existing nearest neighbor classifiers are not suitable for classifying mutually exclusive classes, because nearest neighbor classifiers cannot guarantee inter-class discrepancy. As a result, informative samples in the margin area cannot be discovered, and AL performance suffers. To this end, we propose a novel Nearest neighbor Classifier with Margin penalty for Active Learning (NCMAL). First, a mandatory margin penalty is added between classes, so that both inter-class discrepancy and intra-class compactness are assured. Second, a novel sample selection strategy is proposed to discover informative samples within the margin area. To demonstrate the effectiveness of the method, we conduct extensive experiments on four datasets against other state-of-the-art methods. The experimental results demonstrate that our method achieves better results with fewer annotated samples than all baseline methods.
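To make the two ideas above concrete, the sketch below shows one common way a margin penalty can be combined with a prototype-based nearest neighbor classifier, and how samples in the margin area can be selected by the gap between the top-two class similarities. This is a minimal illustration with NumPy; the cosine-prototype formulation, the additive angular margin, and all function names here are assumptions for exposition, not the paper's exact method.

```python
import numpy as np

def margin_scores(x, prototypes, margin=0.2, true_class=None):
    """Cosine similarity of a feature vector x to each class prototype.

    During training, an additive angular margin is applied to the true
    class, which forces inter-class discrepancy and intra-class
    compactness (illustrative, not the paper's exact loss).
    """
    x = x / np.linalg.norm(x)
    protos = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    cos = protos @ x  # cosine similarity to each class prototype
    if true_class is not None:
        theta = np.arccos(np.clip(cos[true_class], -1.0, 1.0))
        # Penalize the true class by widening its angle by `margin`.
        cos[true_class] = np.cos(min(theta + margin, np.pi))
    return cos

def margin_gap(x, prototypes):
    """Gap between the two highest class similarities.

    A small gap means x lies in the margin area between two classes,
    i.e. it is an informative candidate for annotation.
    """
    s = np.sort(margin_scores(x, prototypes))
    return s[-1] - s[-2]
```

An AL round under this sketch would score the unlabeled pool with `margin_gap` and send the samples with the smallest gaps to the annotator.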