The analysis of software requirement specifications (SRS) with Natural Language Processing (NLP) methods has become an important research area in software engineering in recent years. In particular, advances in deep learning and transfer learning for NLP have made it easier to use SRS data for a variety of learning tasks. In this study, we employ a three-stage domain-adaptive fine-tuning approach for three prediction tasks on software requirements, which improves model robustness under a real distribution shift. The multi-class classification tasks involve predicting the type, priority, and severity of requirement texts specified by users. We compare our results with strong classification baselines such as word embedding pooling and Sentence BERT, and show that adaptive fine-tuning leads to performance improvements across the tasks. We find that an adaptively fine-tuned model can be specialized to a particular data distribution, producing accurate results while learning from the abundant textual data available in software engineering task management systems.
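To make the staged setup concrete, the following is a minimal sketch of domain-adaptive fine-tuning followed by task fine-tuning using the Hugging Face transformers and datasets libraries. The base checkpoint (bert-base-uncased), the file names, the label count, and the hyperparameters are illustrative assumptions, not the exact configuration used in this study: a generic pretrained encoder is first adapted to unlabeled requirement text with masked-language-model training, and the adapted weights are then fine-tuned on one labeled task (e.g., requirement type).

```python
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# Stage 1: start from a generic pretrained checkpoint (assumed model name).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Stage 2: domain-adaptive fine-tuning on unlabeled SRS text via masked LM.
# "srs_unlabeled.txt" is a hypothetical file with one requirement per line.
unlabeled = load_dataset("text", data_files={"train": "srs_unlabeled.txt"})["train"]
unlabeled = unlabeled.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="srs-adapted", num_train_epochs=1),
    train_dataset=unlabeled,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
).train()
mlm_model.save_pretrained("srs-adapted")
tokenizer.save_pretrained("srs-adapted")

# Stage 3: task fine-tuning on labeled requirements, e.g. type prediction.
# "srs_labeled.csv" is hypothetical, with "text" and "label" columns.
labeled = load_dataset("csv", data_files={"train": "srs_labeled.csv"})["train"]
labeled = labeled.map(
    lambda batch: tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=128
    ),
    batched=True,
)
clf_model = AutoModelForSequenceClassification.from_pretrained(
    "srs-adapted", num_labels=5  # number of requirement types is an assumption
)
Trainer(
    model=clf_model,
    args=TrainingArguments(output_dir="srs-type-classifier", num_train_epochs=3),
    train_dataset=labeled,
).train()
```

The same stage-3 step can be repeated with priority and severity labels to obtain the other two classifiers from the single domain-adapted checkpoint.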