As various forms of fraud proliferate on Ethereum, it is imperative to safeguard against these malicious activities and protect susceptible users from being victimized. While current studies rely solely on graph-based fraud detection approaches, we argue that they may not be well-suited for handling highly repetitive, skew-distributed, and heterogeneous Ethereum transactions. To address these challenges, we propose BERT4ETH, a universal pre-trained Transformer encoder that serves as an account representation extractor for detecting various fraud behaviors on Ethereum. BERT4ETH leverages the superior modeling capability of the Transformer to capture the dynamic sequential patterns inherent in Ethereum transactions, and addresses the challenges of pre-training a BERT model for Ethereum with three practical and effective strategies: repetitiveness reduction, skew alleviation, and heterogeneity modeling. Our empirical evaluation demonstrates that BERT4ETH significantly outperforms state-of-the-art methods on the phishing account detection and de-anonymization tasks. The code for BERT4ETH is available at: https://github.com/git-disl/BERT4ETH.