Being able to predict stock prices might be the unspoken wish of stock investors. Although stock prices are difficult to predict, there are many theories about what affects their movements, including interest rates, news and social media. Machine learning can identify complex patterns in data that humans cannot readily detect. In this thesis, a machine learning model for time series forecasting is created and tested to predict stock prices. The model is based on a neural network with several LSTM layers and fully connected layers. It is trained on historical stock values, technical indicators and Twitter attributes retrieved, extracted and calculated from posts on the social media platform Twitter. These attributes are sentiment score, favourites, followers, retweets and whether an account is verified. Data is collected through Twitter's API, and sentiment analysis is conducted with VADER. The results show that adding more Twitter attributes improves the MSE between the predicted prices and the actual prices by 3%. With technical analysis taken into account, the MSE decreases from 0.1617 to 0.1437, an improvement of around 11%. The limitations of this study are that the selected stock has to be publicly listed on the stock market and popular on Twitter and among individual investors. In addition, stock markets have limited opening hours, whereas Twitter is constantly available, which may introduce noise into the model.
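The reported improvement can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the standard definition of MSE as the mean of squared differences between actual and predicted prices, and the function names and toy price values are hypothetical, not taken from the thesis. Only the two MSE figures, 0.1617 and 0.1437, come from the abstract.

```python
def mean_squared_error(actual, predicted):
    """Mean of squared differences between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must not be empty")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mse, new_mse):
    """Fractional reduction in MSE relative to the baseline."""
    return (baseline_mse - new_mse) / baseline_mse


# Toy values (hypothetical) to show how the MSE itself is computed.
toy_mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])
print(toy_mse)  # 2.0

# MSE figures reported in the abstract, without and with technical analysis.
improvement = relative_improvement(0.1617, 0.1437)
print(f"{improvement:.1%}")  # 11.1%
```

A lower MSE means the predictions sit closer to the actual prices, so the drop from 0.1617 to 0.1437 corresponds to the roughly 11% improvement quoted above.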