Uncertainty sampling is heavily used in practical active learning to reduce annotation cost. However, there is no broad consensus on which function should be used for uncertainty estimation in binary classification tasks, and the convergence guarantees of the corresponding active learning algorithms are not well understood. The situation is even more challenging for multi-class classification. In this work, we propose an efficient uncertainty estimator for binary classification, which we also extend to multiple classes, and provide a non-asymptotic rate of convergence for our uncertainty sampling-based active learning algorithm in both cases under no-noise conditions (i.e., linearly separable data). We also extend our analysis to the noisy case and provide theoretical guarantees for our algorithm under the influence of noise, for both binary and multi-class classification.
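For context, the generic idea behind uncertainty sampling can be sketched as follows. This is an illustrative margin-based baseline for a linear classifier, not the estimator proposed in this work: the query rule picks the unlabeled point closest to the current decision boundary, on the assumption that such points are the most informative to label.

```python
import numpy as np

def margin_uncertainty(w, X):
    """Uncertainty proxy for a linear classifier with weights w:
    smaller |w . x| means the point is closer to the decision
    boundary, hence more uncertain (we negate so larger = more uncertain)."""
    return -np.abs(X @ w)

def query_most_uncertain(w, X_pool):
    """Return the index of the pool point with the highest uncertainty,
    i.e., the one nearest the decision boundary w . x = 0."""
    return int(np.argmax(margin_uncertainty(w, X_pool)))

# Toy example: with w = (1, 0) the boundary is the line x1 = 0,
# so the point with the smallest |x1| is queried.
w = np.array([1.0, 0.0])
X_pool = np.array([[2.0, 1.0],
                   [0.1, 3.0],
                   [-1.5, 0.5]])
idx = query_most_uncertain(w, X_pool)  # index 1: |0.1| is the smallest margin
```

In an active learning loop, the queried point would be labeled by an annotator, the classifier retrained, and the selection repeated; the choice of uncertainty function (margin, entropy, or the estimator developed in this work) is exactly the design decision the paper addresses.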