Code comments are a key source of information about software artefacts. Depending on the use case, only some types of comments are useful, so automatic approaches to classifying these comments are needed. In this work, we address this need by proposing STACC, a set of SentenceTransformers-based binary classifiers. These lightweight classifiers are trained and tested on the NLBSE Code Comment Classification tool competition dataset and surpass the baseline by a significant margin, achieving an average F1 score of 0.74 against the baseline's 0.31, a relative improvement of 139%. A replication package and the models themselves are publicly available.
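The reported 139% figure is the relative gain of the average F1 score over the baseline. The short Python sketch below reproduces that arithmetic from the two averages stated in the abstract. The function names `f1_score` and `relative_improvement` are illustrative assumptions and are not taken from the replication package.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


# Average F1 scores as reported in the abstract.
stacc_f1 = 0.74
baseline_f1 = 0.31

gain = relative_improvement(stacc_f1, baseline_f1)
print(f"Relative improvement: {gain:.0%}")  # prints: Relative improvement: 139%
```

Note that the improvement is relative to the baseline score: the absolute difference between the two averages is 0.43 F1 points.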