Do Names Echo Semantics? A Large-Scale Study of Identifiers Used in C++'s Named Casts

Information theory · Chromium · Identifiers · Conditional entropy · Reuse

April 3, 2023

Constantin Cezar Petrescu,Sam Smith,Rafail Giavrimis,Santanu Kumar Dash

From arXiv. The manuscript has 27 pages and contains 4 figures, 18 listings, and 4 tables. The preprint has been accepted at the Journal of Systems and Software (Elsevier).

Developers relax restrictions on a type to reuse methods with other types. Type casts are prevalent and, in weakly typed languages such as C++, also extremely permissive. An assignment in which a source expression is cast into a new type and assigned to a target variable of that type can lead to software bugs if performed without care. In this paper, we propose an information-theoretic approach to identify poor implementations of explicit cast operations. Our approach uses conditional entropy to measure the accord between the source expression and the target variable. We collect casts from 34 components of the Chromium project, which collectively account for 27 MLOC, and sample this dataset uniformly at random to create a manually labelled dataset of 271 casts. Information-theoretic vetting of these 271 casts achieves a peak precision of 81% and a recall of 90%. We additionally present the findings of an in-depth investigation of notable explicit casts, two of which were fixed in recent releases of the Chromium project.
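
To make the conditional-entropy idea concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not say which random variables are compared, so the sketch assumes each cast contributes one (source identifier, target identifier) pair and estimates the textbook quantity H(T | S) from pair counts. The function name `conditional_entropy` and the toy identifier pairs are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Estimate H(T | S) in bits from observed (source, target) pairs.

    Uses the standard definition
        H(T | S) = sum over (s, t) of p(s, t) * log2(p(s) / p(s, t)),
    with probabilities taken as relative frequencies.
    """
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)                   # counts of (s, t)
    source = Counter(s for s, _ in pairs)    # counts of s
    return sum((c / n) * log2(source[s] / c) for (s, _), c in joint.items())


# Hypothetical toy data: identifier of the cast's source expression,
# paired with the identifier of the target variable.
consistent = [("width", "width"), ("width", "width"),
              ("height", "height"), ("height", "height")]
mixed = [("width", "width"), ("width", "count"),
         ("height", "height"), ("height", "offset")]

print(conditional_entropy(consistent))  # 0.0: the source name determines the target name
print(conditional_entropy(mixed))       # 1.0: one bit of uncertainty remains
```

Under these assumptions, a low value means the target name is predictable from the source name, while a higher value indicates weaker accord between the two. How the paper turns such a score into a decision about a cast is not stated in the abstract.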
