Code pre-trained models (CodePTMs) have recently demonstrated significant success in code intelligence. To interpret these models, several probing methods have been applied. However, these methods fail to consider the inherent characteristics of code. In this paper, to address this problem, we propose a novel probing method, CAT-probing, to quantitatively interpret how CodePTMs attend to code structure. We first denoise the input code sequences based on the token types predefined by compilers, filtering out tokens whose attention scores are too small. After that, we define a new metric, CAT-score, to measure the commonality between the token-level attention scores generated by CodePTMs and the pair-wise distances between the corresponding AST nodes. The higher the CAT-score, the stronger the ability of CodePTMs to capture code structure. We conduct extensive experiments integrating CAT-probing with representative CodePTMs across different programming languages. Experimental results show the effectiveness of CAT-probing in CodePTM interpretation. Our code and data are publicly available at https://github.com/nchen909/CodeAttention.
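To make the metric concrete, the following is a minimal sketch (not the paper's exact formulation) of how an agreement-style CAT-score could be computed from a single attention head. The function name `cat_score`, the thresholds, and the agreement-counting scheme are illustrative assumptions; the precise definition is given in the paper.

```python
import numpy as np

def cat_score(attention, ast_distance, keep_mask,
              attn_threshold=0.05, dist_threshold=2):
    """Hypothetical sketch of a CAT-score-style agreement measure.

    attention:     (n, n) token-level attention scores from one CodePTM head
    ast_distance:  (n, n) pair-wise distances between the tokens' AST nodes
    keep_mask:     (n,) boolean mask of tokens kept after type-based denoising
    The thresholds are illustrative placeholders, not values from the paper.
    """
    idx = np.where(keep_mask)[0]
    agree, total = 0, 0
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            i, j = idx[a], idx[b]
            high_attn = attention[i, j] > attn_threshold
            close_in_ast = ast_distance[i, j] < dist_threshold
            # Count a pair as agreeing when high attention coincides
            # with small AST distance (or low attention with large distance).
            agree += int(high_attn == close_in_ast)
            total += 1
    # Higher agreement suggests the head's attention tracks AST proximity.
    return agree / total if total else 0.0
```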