Over the last few years, machine-learning-based methods have been applied to extract information from news flow in the financial domain. However, this information has mostly taken the form of the financial sentiment contained in news headlines, primarily in relation to stock prices. In this work, we propose that various other dimensions of information can be extracted from news headlines that will be of interest to investors, policy-makers and other practitioners. We propose a framework that extracts information such as past price movements, the expected direction of prices, asset comparisons and other general information that the news refers to. We apply this framework to the commodity "Gold" and train machine learning models on a dataset of 11,412 human-annotated news headlines (released with this study), collected over the period 2000-2019. We run an experiment to validate the causal effect of news flow on gold prices and observe that the information produced by our framework significantly impacts the future gold price.