We introduce small-text, an easy-to-use active learning library, which offers pool-based active learning for single- and multi-label text classification in Python. It features numerous pre-implemented state-of-the-art query strategies, including some that leverage the GPU. Standardized interfaces allow the combination of a variety of classifiers, query strategies, and stopping criteria, facilitating a quick mix and match, and enabling a rapid and convenient development of both active learning experiments and applications. With the objective of making various classifiers and query strategies accessible for active learning, small-text integrates several well-known machine learning libraries, namely scikit-learn, PyTorch, and Hugging Face transformers. The latter integrations are optionally installable extensions, so GPUs can be used but are not required. Using this new library, we investigate the performance of the recently published SetFit training paradigm, which we compare to vanilla transformer fine-tuning, finding that it matches the latter in classification accuracy while outperforming it in area under the curve. The library is available under the MIT License at https://github.com/webis-de/small-text, in version 1.3.0 at the time of writing.
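To illustrate the "mix and match" workflow described above, the following is a minimal sketch of a pool-based active learning loop using the scikit-learn integration of small-text 1.x (no GPU required). It assumes the top-level classes `PoolBasedActiveLearner`, `SklearnClassifierFactory`, `SklearnDataset`, and `LeastConfidence` as exposed in version 1.x; exact names and signatures may differ across releases, and the toy corpus and label source are placeholders for illustration only.

```python
# Minimal sketch of pool-based active learning with small-text 1.x
# (scikit-learn integration). Names/signatures assumed from the 1.x API.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

from small_text import (
    PoolBasedActiveLearner,
    SklearnClassifierFactory,
    SklearnDataset,
    LeastConfidence,
)

# Toy binary corpus (placeholder data for illustration).
texts = [f'great plot and acting {i}' for i in range(50)] \
      + [f'boring and far too long {i}' for i in range(50)]
labels = np.array([1] * 50 + [0] * 50)
num_classes = 2

# Vectorize the texts and wrap them in a small-text dataset.
vectorizer = TfidfVectorizer()
x = vectorizer.fit_transform(texts)
dataset = SklearnDataset(x, labels)

# Combine a classifier factory, a query strategy, and the pool:
# this is the interchangeable triple the standardized interfaces enable.
clf_factory = SklearnClassifierFactory(LogisticRegression(), num_classes)
query_strategy = LeastConfidence()
active_learner = PoolBasedActiveLearner(clf_factory, query_strategy, dataset)

# Provide an initial labeled set, then iterate: query, label, update.
indices_initial = np.random.choice(len(labels), size=10, replace=False)
active_learner.initialize_data(indices_initial, labels[indices_initial])

for _ in range(3):
    indices_queried = active_learner.query(num_samples=10)
    y_new = labels[indices_queried]  # in practice: labels come from a human oracle
    active_learner.update(y_new)
```

Swapping in a GPU-based classifier (via the PyTorch or transformers extensions) or a different query strategy only changes the factory and strategy objects; the surrounding loop stays the same.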