We survey both common and novel predictive models for stock price prediction and combine them with technical indicators, fundamental characteristics, and text-based sentiment data to predict S&P stock prices. By combining machine learning models such as Random Forest and LSTM into state-of-the-art ensemble models, we achieve 66.18% accuracy in directional prediction of the S&P 500 index and 62.09% accuracy in directional prediction of individual stocks. Our data comprise weekly historical prices, financial reports, and the text of news items associated with 518 common stocks issued by current and former S&P 500 large-cap companies, from January 1, 2000 to December 31, 2019. The study's innovations include using deep language models to categorize financial news items and infer their sentiment; fusing models built on different combinations of variables and stocks to make joint predictions; and overcoming the insufficient-data problem that machine learning models face in time series by pooling data across stocks.
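To make the notions of directional accuracy and model fusion concrete, the following is a minimal sketch in Python. It assumes that each model emits a weekly up/down call and that the calls are fused by simple majority vote. The abstract does not specify the fusion rule, so majority voting is an illustrative assumption, and the function names `directional_accuracy` and `majority_vote` as well as the toy data are hypothetical.

```python
from typing import List, Sequence


def directional_accuracy(predicted_up: Sequence[bool], actual_up: Sequence[bool]) -> float:
    """Fraction of periods in which the predicted direction matches the realized one."""
    if len(predicted_up) != len(actual_up) or len(actual_up) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted_up, actual_up))
    return hits / len(actual_up)


def majority_vote(model_predictions: Sequence[Sequence[bool]]) -> List[bool]:
    """Fuse per-model up/down calls by majority vote (assumed rule, not from the paper).

    A period is called "up" only if strictly more than half of the models say "up";
    ties therefore resolve to "down".
    """
    n_models = len(model_predictions)
    return [2 * sum(votes) > n_models for votes in zip(*model_predictions)]


# Toy example: three hypothetical models, four weeks of up (True) / down (False) calls.
model_a = [True, False, True, True]
model_b = [True, True, False, True]
model_c = [False, False, True, True]
actual = [True, False, False, True]

ensemble = majority_vote([model_a, model_b, model_c])
accuracy = directional_accuracy(ensemble, actual)
print(ensemble, accuracy)
```

Under this definition, the reported 66.18% would mean that the predicted direction matched the realized direction in about two of every three evaluated periods.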