CatIs: 使用变换器对问题报告进行分类的智能工具 (CatIss: An Intelligent Tool for Categorizing Issues Reports using Transformers) - 专知论文

会员服务 ·

0

微平均 · MoDELS · 变换 · INFORMS · Performer ·

2022 年 3 月 31 日

CatIss: An Intelligent Tool for Categorizing Issues Reports using Transformers

翻译：CatIs: 使用变换器对问题报告进行分类的智能工具

from arxiv, To appear in the Proceedings of the 1sth International Workshop on Natural Language-based Software Engineering (NLBSE), co-located with ICSE, 2022

Users use Issue Tracking Systems to keep track and manage issue reports in their repositories. An issue is a rich source of software information that contains different reports including a problem, a request for new features, or merely a question about the software product. As the number of these issues increases, it becomes harder to manage them manually. Thus, automatic approaches are proposed to help facilitate the management of issue reports. This paper describes CatIss, an automatic CATegorizer of ISSue reports which is built upon the Transformer-based pre-trained RoBERTa model. CatIss classifies issue reports into three main categories of Bug reports, Enhancement/feature requests, and Questions. First, the datasets provided for the NLBSE tool competition are cleaned and preprocessed. Then, the pre-trained RoBERTa model is fine-tuned on the preprocessed dataset. Evaluating CatIss on about 80 thousand issue reports from GitHub, indicates that it performs very well surpassing the competition baseline, TicketTagger, and achieving 87.2% F1-score (micro average). Additionally, as CatIss is trained on a wide set of repositories, it is a generic prediction model, hence applicable for any unseen software project or projects with little historical data. Scripts for cleaning the datasets, training CatIss, and evaluating the model are publicly available.

翻译：用户使用“ 问题跟踪系统” 来跟踪和管理其存储库中的问题报告。问题是一个丰富的软件信息来源, 包含不同的报告, 包括问题、请求新功能, 或只是软件产品问题。随着这些问题的数量增加, 手工管理这些问题变得更加困难。因此, 提议自动方法来帮助管理问题报告。本文描述了基于以变异器为基础的预先培训的RobreTagle模型的SISUE报告的自动CATEGIATISATIS。 CatIs将发行报告分为三个主要类别, 包括错误报告、增强/ 功能请求和问题。首先, 为BLBSE工具竞赛提供的数据集被清理和预先处理。然后, 预先培训的RoBERTA模型将更精确地调整到预处理问题报告。对来自GitHub的大约80 000份问题报告进行评估的CatIs, 显示它的表现非常出色地超过了竞争基线, 滴Tagger, 以及达到87. 2 % F1- 核心( 微型平均数 ) 。此外, CatIls 被训练过一个可公开使用的历史数据储存模型, 模型。

0

相关内容

微平均

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

泛素交联酶UbcH7在DNA损伤应答过程中的功能和机制分析

国家自然科学基金

0+阅读 · 2015年12月31日

F-actin结合蛋白在维甲酸诱导的舌肌发育不良中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

熔石英损伤增长的复合波长效应研究

国家自然科学基金

0+阅读 · 2015年12月31日

流程监控与评估中多元数据整合研究

国家自然科学基金

1+阅读 · 2014年12月31日

移动机会网络的传输潜力分析与研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于光纤光栅的旋转机械扭振传感机理和方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于fMRI-ICA方法的海员出海前后脑功能连通性检测及应用研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于多光谱成像技术的稻飞虱自动识别与危害评估研究

国家自然科学基金

0+阅读 · 2009年12月31日

涂层显微结构与表面纹理织构耦合的低摩擦效应及其机理

国家自然科学基金

0+阅读 · 2009年12月31日

冻融作用引起的黄土边坡剥落机理研究

国家自然科学基金

0+阅读 · 2008年12月31日

BugListener: Identifying and Synthesizing Bug Reports from Collaborative Live Chats

Arxiv

0+阅读 · 2022年4月20日

Towards General Purpose Vision Systems

Arxiv

0+阅读 · 2022年4月19日

Automated Application Processing

Arxiv

1+阅读 · 2022年4月19日

How are Software Repositories Mined? A Systematic Literature Review of Workflows, Methodologies, Reproducibility, and Tools

Arxiv

0+阅读 · 2022年4月17日

Approaching sales forecasting using recurrent neural networks and transformers

Arxiv

0+阅读 · 2022年4月16日

Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking

Arxiv

0+阅读 · 2022年4月15日

Is Surprisal in Issue Trackers Actionable?

Arxiv

0+阅读 · 2022年4月15日

Transformers in Medical Image Analysis: A Review

Transformers in Medical Image Analysis: A Review

Arxiv

40+阅读 · 2022年2月24日

An application of cascaded 3D fully convolutional networks for medical image segmentation

Arxiv

10+阅读 · 2018年3月20日

Re-ID done right: towards good practices for person re-identification

Arxiv

14+阅读 · 2018年1月16日

VIP会员

文章信息

相关主题

相关VIP内容

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能赋能自主武器与人类控制第三部分：人类控制与系统操作员 | 35页

人工智能赋能自主武器与人类控制第一部分：人类控制与机器学习的设计和开发 | 46页

军事指挥控制系统：2025年5种用途

人工智能赋能自主武器与人类控制第二部分：人类控制与军事指挥官 | 38页

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

BugListener: Identifying and Synthesizing Bug Reports from Collaborative Live Chats

Arxiv

0+阅读 · 2022年4月20日

Towards General Purpose Vision Systems

Arxiv

0+阅读 · 2022年4月19日

Automated Application Processing

Arxiv

1+阅读 · 2022年4月19日

How are Software Repositories Mined? A Systematic Literature Review of Workflows, Methodologies, Reproducibility, and Tools

Arxiv

0+阅读 · 2022年4月17日

Approaching sales forecasting using recurrent neural networks and transformers

Arxiv

0+阅读 · 2022年4月16日

Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking

Arxiv

0+阅读 · 2022年4月15日

Is Surprisal in Issue Trackers Actionable?

Arxiv

0+阅读 · 2022年4月15日

Transformers in Medical Image Analysis: A Review

Transformers in Medical Image Analysis: A Review

Arxiv

40+阅读 · 2022年2月24日

An application of cascaded 3D fully convolutional networks for medical image segmentation

Arxiv

10+阅读 · 2018年3月20日

Re-ID done right: towards good practices for person re-identification

Arxiv

14+阅读 · 2018年1月16日

相关基金

泛素交联酶UbcH7在DNA损伤应答过程中的功能和机制分析

国家自然科学基金

0+阅读 · 2015年12月31日

F-actin结合蛋白在维甲酸诱导的舌肌发育不良中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

熔石英损伤增长的复合波长效应研究

国家自然科学基金

0+阅读 · 2015年12月31日

流程监控与评估中多元数据整合研究

国家自然科学基金

1+阅读 · 2014年12月31日

移动机会网络的传输潜力分析与研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于光纤光栅的旋转机械扭振传感机理和方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于fMRI-ICA方法的海员出海前后脑功能连通性检测及应用研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于多光谱成像技术的稻飞虱自动识别与危害评估研究

国家自然科学基金

0+阅读 · 2009年12月31日

涂层显微结构与表面纹理织构耦合的低摩擦效应及其机理

国家自然科学基金

0+阅读 · 2009年12月31日

冻融作用引起的黄土边坡剥落机理研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员