Quantifying the similarity between companies has proven useful for several purposes, including company benchmarking, fraud detection, and the search for investment opportunities. Similarity can be computed from a variety of data sources, such as company activity data and financial data. Among these, ledger account data is widely available and standardized to a large extent. The ledger accounts within a financial statement can be represented as a tree, i.e. a special type of graph, that captures both the values of the ledger accounts and the relationships between them. Given their broad availability and rich information content, financial statements are a prime data source from which company similarities or distances can be computed. In this paper, we present a graph distance metric for computing the similarity between the financial statements of two companies. We conduct a comprehensive experimental study on real-world financial data to demonstrate the usefulness of the proposed distance metric. The results are promising across a number of use cases. The method may be useful for investors looking for investment opportunities, government officials attempting to identify fraudulent companies, and accountants looking to benchmark a group of companies based on their financial statements.
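To make the tree representation concrete, the following minimal Python sketch models a financial statement as a tree of ledger accounts and compares two such trees. It is an illustration only, not the distance metric proposed in the paper: the `Account` class, the account codes, the figures, and the `naive_distance` function are all hypothetical, and the comparison shown is a simple sum of absolute differences between each account's share of the root total.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """One ledger account: a node in the financial-statement tree."""
    code: str                      # hypothetical account code
    value: float = 0.0             # only used for leaf accounts
    children: list["Account"] = field(default_factory=list)

    def total(self) -> float:
        """A leaf returns its own value; a parent sums its children."""
        if not self.children:
            return self.value
        return sum(child.total() for child in self.children)


def flatten(node, root_total=None):
    """Map every account code to its share of the root total."""
    if root_total is None:
        root_total = node.total()
    share = node.total() / root_total if root_total else 0.0
    shares = {node.code: share}
    for child in node.children:
        shares.update(flatten(child, root_total))
    return shares


def naive_distance(a, b):
    """Illustrative stand-in, NOT the paper's metric: sum of absolute
    differences in account shares, treating missing accounts as zero."""
    shares_a, shares_b = flatten(a), flatten(b)
    codes = shares_a.keys() | shares_b.keys()
    return sum(abs(shares_a.get(c, 0.0) - shares_b.get(c, 0.0)) for c in codes)


# Two made-up statements with the same structure but different proportions.
company_a = Account("assets", children=[Account("fixed", 60.0),
                                        Account("current", 40.0)])
company_b = Account("assets", children=[Account("fixed", 30.0),
                                        Account("current", 70.0)])
distance = naive_distance(company_a, company_b)
```

Normalizing by the root total makes companies of different sizes comparable, which is one plausible design choice; the abstract does not say how the proposed metric treats scale or hierarchy.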