Active learning is the iterative construction of a classification model through targeted labeling, enabling significant savings in labeling costs. As most research on active learning was carried out before transformer-based language models ("transformers") became popular, comparatively few papers have investigated how transformers can be combined with active learning, despite its practical importance. This can be attributed to the fact that using state-of-the-art query strategies with transformers induces a prohibitive runtime overhead, which effectively cancels out, or even outweighs, the aforementioned cost savings. In this paper, we revisit uncertainty-based query strategies, which had largely been outperformed before but are particularly well suited for fine-tuning transformers. In an extensive evaluation on five widely used text classification benchmarks, we show that considerable improvements of up to 14.4 percentage points in area under the learning curve are achieved, as well as a final accuracy close to the state of the art on all but one benchmark, using only between 0.4% and 15% of the training data.
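As a minimal sketch of the kind of uncertainty-based query strategy discussed here (the function and variable names are hypothetical illustrations, not the paper's implementation), a prediction-entropy strategy ranks unlabeled examples by the entropy of the model's predicted class distribution and queries the most uncertain ones for labeling:

```python
import numpy as np

def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """Entropy of each row of class probabilities; higher means more uncertain."""
    eps = 1e-12  # guard against log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

def query_most_uncertain(probs: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k unlabeled examples with the highest predictive entropy."""
    scores = predictive_entropy(probs)
    return np.argsort(-scores)[:k]

# Example: model probabilities for 4 unlabeled documents over 3 classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction -> low entropy
    [0.34, 0.33, 0.33],  # nearly uniform -> high entropy
    [0.60, 0.30, 0.10],
    [0.50, 0.45, 0.05],
])
print(query_most_uncertain(probs, k=2))  # -> [1 2]
```

In each active learning iteration, the selected examples would be labeled, added to the training set, and the transformer fine-tuned again; the cheap score computation is what keeps the per-iteration overhead low compared to more elaborate query strategies.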