Characterising the Knowledge about Primitive Variables in Java Code Comments - 专知论文


2021 年 3 月 23 日

Characterising the Knowledge about Primitive Variables in Java Code Comments


Mahfouth Alghamdi,Shinpei Hayashi,Takashi Kobayashi,Christoph Treude

Primitive types are fundamental components available in any programming language and serve as the building blocks of data manipulation. Understanding the role of these types in source code is essential for writing software. Little work has been conducted on how often variables of primitive types are documented in code comments and what types of knowledge the comments provide about them. In this paper, we present an approach for detecting primitive variables and their descriptions in comments using lexical matching and advanced matching. We evaluate both approaches in terms of recall, precision, and F-score against 600 manually annotated variables from a sample of GitHub projects. The advanced approach achieved a higher F-score than lexical matching (0.986 versus 0.942). We then create a taxonomy of the types of knowledge that these comments contain about variables of primitive types. Our study showed that developers documented the purpose (69.16%) and concept (72.75%) of identifiers of numeric variables more often than those of String variables, for which purpose (61.14%) and concept (55.46%) were documented less often. Our findings characterise the current state of the practice of documenting primitive variables and point at areas that are often not well documented, such as the meaning of boolean variables or the purpose of fields and local variables.
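The abstract does not spell out how lexical matching or the evaluation is implemented, so the following is only a minimal sketch under stated assumptions. It treats a lexical match as the identifier appearing verbatim (case-insensitively) as a token in the comment, and scores the matches against a manually annotated set using the standard definitions of precision, recall, and F-score. The function names, the tokenisation rule, and the three sample variables are hypothetical and are not taken from the paper.

```python
import re


def comment_tokens(comment):
    """Return the lowercased word tokens of a comment (assumed tokenisation)."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", comment.lower()))


def lexical_match(identifier, comment):
    """True if the identifier appears verbatim as a token in the comment.

    This is an illustrative reading of 'lexical matching', not the
    authors' implementation.
    """
    return identifier.lower() in comment_tokens(comment)


def precision_recall_f1(predicted, actual):
    """Standard precision, recall, and F-score over two sets of identifiers."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (identifier, comment) pairs standing in for Java variables.
samples = [
    ("retryCount", "// retryCount is the number of attempts so far"),
    ("timeout", "// maximum wait in milliseconds"),
    ("name", "// display name of the user"),
]

# Variables the lexical matcher considers documented.
predicted = {ident for ident, comment in samples if lexical_match(ident, comment)}

# Hypothetical manual annotation: all three variables are described,
# including `timeout`, whose comment never mentions the identifier itself.
annotated = {"retryCount", "timeout", "name"}

precision, recall, f1 = precision_recall_f1(predicted, annotated)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy data the matcher misses `timeout` because the comment describes the variable without naming it, which lowers recall while precision stays perfect. Cases like this are one plausible reason a more elaborate matching step could score higher than purely lexical matching, though the abstract does not say what the advanced matching does.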

