Ethereum smart contracts are programs that run on the Ethereum blockchain, and many smart contract vulnerabilities have been discovered over the past decade. Many security analysis tools have been created to detect such vulnerabilities, but their performance degrades drastically when the code under analysis has been rewritten. In this paper, we propose Eth2Vec, a machine-learning-based static analysis tool for vulnerability detection that is robust against code rewrites in smart contracts. Existing machine-learning-based static analysis tools for vulnerability detection require features that analysts create manually as inputs. In contrast, Eth2Vec automatically learns features of vulnerable Ethereum Virtual Machine (EVM) bytecode, as tacit knowledge, through a neural network for language processing. Eth2Vec can therefore detect vulnerabilities in smart contracts by comparing the code similarity between the target EVM bytecode and the EVM bytecode it has already learned. We conducted experiments with existing open databases, such as Etherscan, and our results show that Eth2Vec outperforms existing work in terms of well-known metrics, i.e., precision, recall, and F1-score. Moreover, Eth2Vec can detect vulnerabilities even in rewritten code.
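To make the similarity-based detection step concrete, the following is a minimal sketch and not Eth2Vec's actual implementation. It assumes that each piece of EVM bytecode has already been mapped to a fixed-length embedding vector, and it uses cosine similarity as the comparison measure; the abstract does not specify the measure, so this is an assumption. The vectors, the vulnerability labels, and the function names `cosine_similarity` and `most_similar` are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target, known):
    """Return (label, score) of the known embedding closest to the target."""
    best_label, best_score = None, -1.0
    for label, vec in known.items():
        score = cosine_similarity(target, vec)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical, hand-made embeddings standing in for learned bytecode vectors.
known = {
    "reentrancy": [0.9, 0.1, 0.0],
    "integer_overflow": [0.1, 0.8, 0.2],
    "benign": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a rewritten contract that stays close to the
# "reentrancy" example despite surface-level changes.
target = [0.8, 0.2, 0.1]

label, score = most_similar(target, known)
print(label, round(score, 3))
```

In this toy setting, a rewritten contract whose embedding stays near a known vulnerable example is still matched to it, which is the intuition behind robustness to code rewrites; how well that holds in practice depends on the learned embedding.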