Applying information theory to software evolution

Software evolution · Information theory · Structural · Software · Unstable

March 24, 2023


Adriano Torres, Sebastian Baltes, Christoph Treude, Markus Wagner

From arXiv: 8 pages, 6 figures, submitted to the NLBSE 2023 workshop

Although information theory has found success in many disciplines, the literature on its applications to software evolution is limited. We are still missing artifacts that leverage the available data and tooling to measure how the information content of a project can serve as a proxy for its complexity. In this work, we explore two definitions of entropy, one structural and one textual, and apply them to the commit histories of 25 open source projects. We produce evidence that the two entropies are generally highly correlated. We also observe that they display weak and unstable correlations with other complexity metrics. Our preliminary investigation of outliers shows an unexpectedly high frequency of events in which the information content of the project changes considerably, suggesting that such outliers may inform a definition of surprisal.
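The abstract does not spell out either entropy definition, so the following is only a minimal sketch of what a textual entropy could look like, assuming it is the Shannon entropy of the empirical distribution of whitespace-separated tokens. The function names `shannon_entropy` and `textual_entropy`, the tokenization, and the two file snapshots are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the empirical distribution of `tokens`."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # convention: an empty sequence carries no information
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def textual_entropy(source):
    """Hypothetical textual entropy: entropy of whitespace-split tokens.

    Assumption for illustration only; the paper may tokenize differently.
    """
    return shannon_entropy(source.split())


# Two made-up snapshots of one file at consecutive commits.
before = "a = 1\nb = 1\n"
after = "a = 1\nb = 2\nc = a + b\n"

# Change in information content between the two snapshots.
delta = textual_entropy(after) - textual_entropy(before)
print(f"entropy change: {delta:.3f} bits")
```

Tracking such a per-commit difference over a project's history is one way unusually large changes in information content could be flagged as outliers, although the thresholds and the structural counterpart would have to come from the paper itself.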

