In both commercial and open-source software, bug reports or issues are used to track bugs or feature requests. However, the quality of issues varies considerably. Prior research has found that bug reports of good quality tend to gain more attention than those of poor quality. As an essential component of an issue, the title is an important aspect of issue quality. Moreover, issues are usually presented in a list view, where only the issue title and some metadata are shown. In this case, a concise and accurate title is crucial for readers to grasp the general concept of the issue, and it facilitates issue triage. Previous work formulated issue title generation as a one-sentence summarization task and employed a sequence-to-sequence model to solve it. However, such a model requires a large amount of domain-specific training data to attain good performance on issue title generation. Recently, pre-trained models, which learn knowledge from large-scale general corpora, have shown great success in software engineering tasks. In this work, we make the first attempt to fine-tune BART, which has been pre-trained on English corpora, to generate issue titles. We implemented the fine-tuned BART as a web tool named iTiger, which can suggest an issue title based on the issue description. iTiger is fine-tuned on 267,094 GitHub issues. We compared iTiger with the state-of-the-art method, i.e., iTAPE, on 33,438 issues. The automatic evaluation shows that iTiger outperforms iTAPE by 29.7%, 50.8%, and 34.1% in terms of ROUGE-1, ROUGE-2, and ROUGE-L F1-scores, respectively. The manual evaluation also demonstrates that the titles generated by BART are preferred by evaluators over the titles generated by iTAPE in 72.7% of cases. Moreover, the evaluators deem our tool useful and easy to use, and they express interest in using it in the future.