As pump-and-dump schemes (P&Ds) proliferate in the cryptocurrency market, it becomes imperative to detect such fraudulent activities in advance, so that potentially susceptible investors can be warned before they become victims. In this paper, we focus on the target coin prediction task, i.e., predicting, before a pump occurs, the pump probability of every coin listed on the target exchange. We conduct a comprehensive study of the latest P&Ds, investigating 709 events organized in Telegram channels from Jan. 2019 to Jan. 2022, and uncover some abnormal yet interesting patterns. Empirical analysis demonstrates that pumped coins exhibit intra-channel homogeneity and inter-channel heterogeneity, which inspires us to develop a novel sequence-based neural network named SNN. Specifically, SNN encodes each channel's pump history as a sequence representation via a positional attention mechanism, which filters useful information and alleviates the noise introduced when the sequence is long. We also identify and address the coin-side cold-start problem in a practical setting. Extensive experiments show that our method brings a lift of 1.6% in AUC and 41.0% in Hit Ratio@3, making it well-suited for real-world application. As a side contribution, we release the source code of our entire data science pipeline on GitHub, along with the dataset tailored for studying the latest P&Ds.
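To make the Hit Ratio@3 figure concrete, the following is a minimal sketch of how such a metric is commonly computed for a ranking task like target coin prediction: for each pump event, all listed coins are ranked by predicted pump probability, and the event counts as a hit if the coin that was actually pumped appears among the top k. The abstract does not spell out the exact definition used in the paper, so this follows the usual convention; the function name `hit_ratio_at_k`, the coin tickers, and the scores are all hypothetical and purely illustrative.

```python
from typing import Dict, List


def hit_ratio_at_k(scores_per_event: List[Dict[str, float]],
                   pumped_coins: List[str],
                   k: int = 3) -> float:
    """Fraction of events whose pumped coin ranks in the top-k by score.

    Assumed (conventional) definition, not taken from the paper.
    """
    if len(scores_per_event) != len(pumped_coins):
        raise ValueError("need exactly one pumped coin per event")
    if not scores_per_event:
        return 0.0
    hits = 0
    for scores, target in zip(scores_per_event, pumped_coins):
        # Rank coin names by predicted pump probability, highest first.
        top_k = sorted(scores, key=scores.get, reverse=True)[:k]
        hits += target in top_k
    return hits / len(scores_per_event)


# Toy example with made-up tickers and probabilities.
events = [
    {"AAA": 0.9, "BBB": 0.5, "CCC": 0.3, "DDD": 0.1},  # pumped coin: CCC
    {"AAA": 0.2, "BBB": 0.8, "CCC": 0.6, "DDD": 0.7},  # pumped coin: AAA
]
targets = ["CCC", "AAA"]
hr3 = hit_ratio_at_k(events, targets, k=3)
```

In this toy setting the first event is a hit (CCC is ranked third) and the second is a miss (AAA is ranked last), so `hr3` evaluates to 0.5. Because each event has exactly one pumped coin among many candidates, a top-k metric of this kind reflects how useful a ranked warning list would be to an investor more directly than AUC alone.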