In today's data-driven society, supervised machine learning is rapidly evolving, and the need for labeled data is increasing. However, the process of acquiring labels is often expensive and tedious. For this reason, we developed ALANNO, an open-source annotation system for NLP tasks powered by active learning. We focus on the practical challenges in deploying active learning systems and try to find solutions to make active learning effective in real-world applications. We support the system with a wealth of active learning methods and underlying machine learning models. In addition, we leave open the possibility to add new methods, which makes the platform useful for both high-quality data annotation and research purposes.
翻译:在当今的数据驱动社会,受监督的机器学习正在迅速演变,对标签数据的需求正在增加。然而,获取标签的过程往往昂贵而乏味。为此,我们开发了ALANNO,这是为国家学习计划任务开发的开放源代码说明系统,由积极学习驱动。我们注重在部署积极学习系统方面的实际挑战,并努力寻找解决办法,使积极学习在现实世界应用中行之有效。我们用大量积极学习的方法和基本机器学习模式支持该系统。此外,我们保留了增加新方法的可能性,这使得平台对高质量数据说明和研究目的都有用。