We present small-text, a simple, modular active learning library that offers pool-based active learning for text classification in Python. It comes with various pre-implemented state-of-the-art query strategies, including some that can leverage the GPU. Clearly defined interfaces allow a multitude of such query strategies to be combined with different classifiers, which makes it easy to mix and match components and enables rapid development of both active learning experiments and applications. To make various classifiers accessible in a consistent way, it integrates several well-known machine learning libraries, namely scikit-learn, PyTorch, and Hugging Face transformers; the latter two integrations are available as optionally installable extensions. The library is available under the MIT License at https://github.com/webis-de/small-text.
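To illustrate what pool-based active learning with interchangeable query strategies and classifiers means in practice, the following is a minimal, standard-library-only sketch. It does not use or reproduce the small-text API: all names here (`Classifier`, `CentroidClassifier`, `random_sampling`, `least_confidence`, `active_learning_loop`) and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random
from typing import List, Protocol, Sequence

Vector = Sequence[float]


class Classifier(Protocol):
    """Hypothetical classifier interface: anything with fit and predict_proba."""
    def fit(self, X: List[Vector], y: List[int]) -> None: ...
    def predict_proba(self, X: List[Vector]) -> List[List[float]]: ...


class CentroidClassifier:
    """Toy classifier: softmax over negative squared distances to class centroids."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = []
        for c in self.classes_:
            members = [x for x, label in zip(X, y) if label == c]
            self.centroids_.append([sum(col) / len(members) for col in zip(*members)])

    def predict_proba(self, X):
        probs = []
        for x in X:
            scores = [-sum((a - b) ** 2 for a, b in zip(x, c)) for c in self.centroids_]
            top = max(scores)  # subtract the maximum for numerical stability
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            probs.append([e / total for e in exps])
        return probs


# A query strategy is any function (clf, pool, unlabeled_indices, n) -> chosen indices.
def random_sampling(clf, pool, unlabeled, n):
    """Baseline strategy: pick n unlabeled instances uniformly at random."""
    return random.sample(unlabeled, n)


def least_confidence(clf, pool, unlabeled, n):
    """Uncertainty strategy: pick the n instances whose top class probability is lowest."""
    probs = clf.predict_proba([pool[i] for i in unlabeled])
    ranked = sorted(zip(unlabeled, probs), key=lambda pair: max(pair[1]))
    return [i for i, _ in ranked[:n]]


def active_learning_loop(clf, strategy, pool, oracle, initial, rounds, batch_size):
    """Alternate between training on the labeled set and querying new labels."""
    labeled = list(initial)
    for _ in range(rounds):
        clf.fit([pool[i] for i in labeled], [oracle[i] for i in labeled])
        seen = set(labeled)
        unlabeled = [i for i in range(len(pool)) if i not in seen]
        if not unlabeled:
            break
        # In a real application, a human annotator would label the queried items;
        # here the oracle list stands in for the annotator.
        labeled.extend(strategy(clf, pool, unlabeled, min(batch_size, len(unlabeled))))
    clf.fit([pool[i] for i in labeled], [oracle[i] for i in labeled])
    return labeled


# Toy one-dimensional pool with two classes; indices 0 and 5 are labeled initially.
pool = [[0.0], [1.0], [4.0], [5.5], [9.0], [10.0]]
oracle = [0, 0, 0, 1, 1, 1]
labeled = active_learning_loop(
    CentroidClassifier(), least_confidence, pool, oracle,
    initial=[0, 5], rounds=2, batch_size=1,
)
print(labeled)
```

Because the loop depends only on the two small interfaces, swapping `least_confidence` for `random_sampling`, or `CentroidClassifier` for any other object with `fit` and `predict_proba`, requires no change to the loop itself. This is the kind of mix and match that the library's interfaces are meant to facilitate.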