Active Learning (AL) is a powerful tool for addressing modern machine learning problems with significantly fewer labeled training instances. However, applying traditional AL methodologies in practical scenarios is challenging because of the assumptions they make. Common hindrances include the unavailability of labels for the AL algorithm at the outset, an unreliable external source of labels during the querying process, and incompatible mechanisms for evaluating the performance of the active learner. Motivated by these challenges, we present a hybrid query strategy-based AL framework that addresses three of them simultaneously: cold start, oracle uncertainty, and performance evaluation of the active learner in the absence of ground truth. A pre-clustering approach is employed to address the cold-start problem, while uncertainty about the labeler's expertise and confidence in the given labels is incorporated to handle oracle uncertainty. The heuristics obtained during the querying process serve as the basis for assessing the performance of the active learner. The robustness of the proposed AL framework is evaluated across three different environments and industrial settings. The results demonstrate that the proposed framework can tackle the practical challenges of implementing AL in real-world scenarios.
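The two mechanisms named in the abstract, pre-clustering for cold start and labeler-aware handling of oracle uncertainty, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm or how expertise and confidence are combined, so the sketch assumes plain k-means with centroid-nearest seed selection and a vote weighted by the product of expertise and confidence. The names `cold_start_queries` and `aggregate_label` are hypothetical.

```python
import numpy as np


def _pairwise_dists(X, centroids):
    # Euclidean distance from every pool point to every centroid, shape (n, k).
    return np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)


def cold_start_queries(X, k, n_iter=50, seed=0):
    """Assumed cold-start rule: cluster the unlabeled pool with k-means and
    return the index of the pool point nearest each centroid (up to k indices)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = _pairwise_dists(X, centroids).argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    nearest = _pairwise_dists(X, centroids).argmin(axis=0)
    # Two centroids can share a nearest point, so deduplicate.
    return sorted({int(i) for i in nearest})


def aggregate_label(votes):
    """Assumed oracle-uncertainty rule: each vote is (label, expertise, confidence)
    and counts with weight expertise * confidence; the heaviest label wins."""
    scores = {}
    for label, expertise, confidence in votes:
        scores[label] = scores.get(label, 0.0) + expertise * confidence
    return max(scores, key=scores.get)


# Synthetic unlabeled pool: three Gaussian blobs in 2-D (illustrative data only).
rng = np.random.default_rng(42)
pool = np.vstack(
    [rng.normal(center, 0.3, size=(30, 2)) for center in ([0, 0], [5, 5], [0, 5])]
)

seed_indices = cold_start_queries(pool, k=3)
print("first points to send to the oracle:", seed_indices)

# One confident expert outweighs two less reliable labelers who disagree.
votes = [(1, 0.9, 0.8), (0, 0.5, 0.6), (0, 0.4, 0.5)]
print("aggregated label:", aggregate_label(votes))
```

Selecting one representative per cluster gives the learner an initial labeled set that spans the pool without requiring any labels up front. The weighted vote is only one of several plausible ways to fold labeler expertise and label confidence into a single decision.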