Finding identical or similar code snippets in source code is one of the fundamental activities in software maintenance. Text-based pattern matching tools such as grep are frequently used for this purpose, but writing queries that return the expected results is not easy. Code clone detectors could be used instead, but their features and output are generally excessive for this task. In this paper, we propose Code Clone matching (CC matching for short), which combines token-based clone detection with meta-patterns enhanced with meta-tokens. The user simply gives a query code snippet, possibly containing a few meta-tokens, and gets back the matching snippets, each of which forms a type 1, 2, or 3 code clone pair with the query. Because the query is a code snippet with meta-tokens, the user retains good control over the resulting matches. CC matching has been implemented as a practical and efficient tool named ccgrep, with a grep-like user interface. The evaluation shows that ccgrep is very effective at finding various kinds of code snippets.
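To make the idea of token-based matching with meta-tokens concrete, the following is a minimal sketch and not the ccgrep implementation. It assumes a toy tokenizer, a small hypothetical keyword set, and a hypothetical meta-token spelled `$ANY` that matches any single token; none of these names or spellings are taken from ccgrep. Identifiers in the query may be renamed consistently in the target (a simplified form of type 2 matching), while all other tokens, including literals, must match exactly.

```python
import re

# Toy tokenizer (assumption): the hypothetical meta-token $ANY, identifiers,
# integer literals, and any other single non-whitespace character.
TOKEN_RE = re.compile(r"\$ANY|[A-Za-z_]\w*|\d+|\S")

# Hypothetical keyword set; keywords are never treated as renamable identifiers.
KEYWORDS = {"if", "else", "for", "while", "return", "int"}


def tokenize(code):
    """Split source text into a flat list of token strings."""
    return TOKEN_RE.findall(code)


def is_identifier(tok):
    """True for identifier-shaped tokens that are not keywords."""
    return re.fullmatch(r"[A-Za-z_]\w*", tok) is not None and tok not in KEYWORDS


def match_at(query, target, start):
    """Check whether the query tokens match the target tokens at offset start."""
    binding = {}  # query identifier -> target identifier
    for q, t in zip(query, target[start:start + len(query)]):
        if q == "$ANY":
            continue  # meta-token: accepts any single target token
        if is_identifier(q) and is_identifier(t):
            # The same query identifier must always map to the same target one.
            if binding.setdefault(q, t) != t:
                return False
            continue
        if q != t:
            return False  # keywords, literals, and punctuation must be equal
    return True


def find_matches(query_code, target_code):
    """Return the token offsets in target_code where query_code matches."""
    query = tokenize(query_code)
    target = tokenize(target_code)
    return [i for i in range(len(target) - len(query) + 1)
            if match_at(query, target, i)]


target = "a = a + 1; b = b + 1; c = d + 1;"
print(find_matches("x = x + 1;", target))
print(find_matches("x = $ANY + 1;", target))
```

In this sketch the plain query matches only the statements that use one variable consistently, whereas replacing the second operand with the hypothetical `$ANY` widens the result to every statement of that shape. This is the sense in which a few meta-tokens in a query snippet let the user control how broadly it matches.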