Recent advances in natural language processing (NLP) have led to strong text classification models for many tasks. However, thousands of examples are still often needed to train models of good quality. This makes it challenging to quickly develop and deploy new models for real-world problems and business needs. Few-shot learning and active learning are two lines of research aimed at tackling this problem. In this work, we combine both lines into FASL, a platform that allows training text classification models using an iterative and fast process. We investigate which active learning methods work best in our few-shot setup. Additionally, we develop a model to predict when to stop annotating. This is relevant because in a few-shot setup we do not have access to a large validation set.