Understanding variations in trading price (volatility), and their response to exogenous information, is a well-researched topic in finance. In this study, we focus on finding stable and accurate volatility predictors for the relatively new asset class of cryptocurrencies, in particular Bitcoin, using deep learning representations of public social media data obtained from Twitter. For our experiments, we extracted semantic information and user statistics from over 30 million Bitcoin-related tweets, together with price data sampled at 15-minute intervals over a horizon of 144 days. Using this data, we built several deep learning architectures that use different combinations of the gathered information. For each model, we conducted ablation studies to assess the influence of individual components and feature sets on prediction accuracy. We found statistical evidence for two hypotheses: (i) temporal convolutional networks perform significantly better than both classical autoregressive models and other deep learning-based architectures in the literature, and (ii) tweet author meta-information, even detached from the tweet itself, is a better predictor of volatility than the semantic content and tweet volume statistics. We demonstrate how different information sets gathered from social media can be used in different architectures and how they affect the prediction results. As an additional contribution, we make our dataset public for future research.
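For readers unfamiliar with temporal convolutional networks, the sketch below illustrates their defining building block, a causal dilated one-dimensional convolution, in plain Python. It is an illustrative assumption and not the architecture used in this study: the function names `causal_dilated_conv1d` and `receptive_field`, the kernel weights, and the dilation schedule are all hypothetical, and a real model would add learned weights, multiple channels, nonlinearities, and residual connections.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], weights: Sequence[float], dilation: int = 1
) -> List[float]:
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Positions before the start of the series are treated as zeros, so the
    output at time t depends only on inputs at times <= t (causality).
    """
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")
    out: List[float] = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of past time steps seen by a stack with one convolution per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0]  # hypothetical input values
    print(causal_dilated_conv1d(series, [1.0, 1.0], dilation=1))
    print(causal_dilated_conv1d(series, [1.0, 1.0], dilation=2))
    print(receptive_field(kernel_size=2, dilations=[1, 2, 4, 8]))
```

Under these assumptions, a four-layer stack with kernel size 2 and dilations 1, 2, 4, 8 sees 16 consecutive time steps, which at a 15-minute sampling interval would correspond to four hours of history. Doubling the dilation at each layer is a common choice because the receptive field then grows exponentially with depth while the parameter count grows only linearly.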