Cross Version Defect Prediction with Class Dependency Embeddings

Software Defect Prediction aims to predict which software modules are most likely to contain defects. The goal is to save time during development by helping find bugs early. Defect prediction models are built from historical data; in particular, one can use data collected from past releases, or versions, of the same application under analysis. Defect prediction based on past versions is called Cross Version Defect Prediction (CVDP). Traditionally, static code metrics are used to predict defects. In this work, we use the Class Dependency Network (CDN) as an additional predictor of defects, combined with static code metrics. CDN data contains structural information about the application being analyzed. Usually, CDN data is analyzed using handcrafted network measures, such as Social Network metrics. Our approach instead uses network embedding techniques to leverage CDN information without having to build the metrics manually. Because embeddings learned separately for each version are not directly comparable, we incorporate different embedding alignment techniques so that they can be used across versions. To evaluate our approach, we performed experiments on 24 software release pairs and compared it against several benchmark methods. In these experiments, we analyzed the performance of two graph embedding techniques, three anchor selection approaches, and two alignment techniques. We also built a meta-model based on two different embeddings and achieved a statistically significant improvement in AUC of 4.7% (p < 0.002) over the baseline method.
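The abstract does not name the specific alignment techniques used, so the following is only a minimal sketch of what aligning embeddings between two versions could look like. It assumes orthogonal Procrustes as the alignment method and treats classes that appear in both versions as anchors. The function name `align_embeddings`, the class names, and the synthetic data are all hypothetical and not taken from the paper.

```python
import numpy as np


def align_embeddings(src, tgt, anchors):
    """Rotate `src` vectors into the space of `tgt` (hypothetical helper).

    src, tgt: dicts mapping class name -> 1-D embedding of equal dimension.
    anchors:  class names present in both versions, used to fit the rotation.
    """
    A = np.stack([src[name] for name in anchors])  # (n_anchors, dim)
    B = np.stack([tgt[name] for name in anchors])  # (n_anchors, dim)
    # Orthogonal Procrustes: R = argmin ||A R - B||_F subject to R^T R = I,
    # solved in closed form from the SVD of A^T B.
    U, _, Vt = np.linalg.svd(A.T @ B)
    R = U @ Vt
    return {name: vec @ R for name, vec in src.items()}


# Synthetic demo: the "old" embedding is the "new" one under a random rotation.
rng = np.random.default_rng(0)
dim = 8
names = [f"pkg.Class{i}" for i in range(20)]  # made-up class names
new_version = {n: rng.normal(size=dim) for n in names}
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # random orthogonal matrix
old_version = {n: v @ Q for n, v in new_version.items()}

anchors = names[:12]  # assumption: only a subset of shared classes is used
aligned = align_embeddings(old_version, new_version, anchors)
max_error = max(np.abs(aligned[n] - new_version[n]).max() for n in names)
print(f"max alignment error: {max_error:.2e}")
```

In this toy setting the two embeddings differ only by a rotation, so the fit on the anchors also recovers the non-anchor classes. With real embeddings trained independently on each release, the alignment would be approximate, and the choice of anchors would matter more.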

