Pre-trained transformer models are the current state of the art for natural language processing. seBERT is such a model: it was developed based on the BERT architecture, but trained from scratch with software engineering data. We fine-tuned this model for the task of issue type prediction in the NLBSE challenge. Our model outperforms the fastText baseline for all three issue types in both recall and precision, achieving an overall F1-score of 85.7%, an increase of 4.1% over the baseline.