The primary aim of this research was to find a model that best predicts which fallen angel bonds will rise back to investment grade and which will fall into bankruptcy. To this end, we sought a machine learning model that could predict bankruptcies. Among the many machine learning models available, we chose four classification methods: logistic regression, k-nearest neighbors (KNN), support vector machines (SVM), and neural networks (NN). We also used Google Cloud's automated machine learning. Our model comparisons showed that the models did not predict bankruptcies well on the original data set, with the exception of Google Cloud's machine learning, which achieved a high precision score. The models did perform very well on our over-sampled and feature-selected data set. However, this is likely because the models were over-fitted to the over-sampled data (that is, they may not accurately predict data outside of this data set). We were therefore unable to create a model that we are confident would predict bankruptcies. Nevertheless, the project yielded two key findings. First, Google Cloud's machine learning model either outperformed or performed on par with the other models on every metric and every data set. Second, feature selection did not reduce predictive power by much, which means that less data needs to be collected for future experiments on predicting bankruptcies.
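The over-fitting concern raised above can be illustrated with a minimal sketch. It assumes random over-sampling of the minority (bankrupt) class, which the abstract does not confirm as the method used, and it relies on hypothetical toy data and helper names (`random_oversample`, `precision_recall`). The point it shows is that over-sampling only the training split leaves the held-out rows untouched, so precision and recall measured on them are less likely to be inflated by duplicated examples.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class has equal size."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Draw extra copies (with replacement) only for classes below the target size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive (here: bankrupt) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data (not from the study): label 1 = bankrupt (rare), 0 = not bankrupt.
X = [(i, i % 7) for i in range(100)]
y = [1 if i % 10 == 0 else 0 for i in range(100)]

# Split first, then over-sample the training part only; the test part keeps
# its original class imbalance and shares no rows with the training data.
X_train, y_train = X[:80], y[:80]
X_test, y_test = X[80:], y[80:]
X_bal, y_bal = random_oversample(X_train, y_train, seed=42)

print("train before:", Counter(y_train))
print("train after: ", Counter(y_bal))
print("test:        ", Counter(y_test))
```

Any classifier would then be fitted on `X_bal`, `y_bal` and scored with `precision_recall` on `X_test`, `y_test`. If over-sampling were instead applied before splitting, copies of the same bankrupt bond could land in both parts, which is one plausible way to obtain scores that do not generalize.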