Chest X-ray (CXR) imaging is one of the most common diagnostic techniques in everyday clinical practice worldwide. We present a work that investigates the use of Deep Learning (DL) techniques to extract information from such images and classify them, keeping the methodology as general as possible so that, with limited additional effort, it could also be applied to real-world scenarios in the future. To this end, we trained several beta-Variational Autoencoder (beta-VAE) models on the CheXpert dataset, one of the largest publicly available collections of labeled CXR images; from these models, latent features were extracted and used to train other Machine Learning models that classify the original images from the features produced by the beta-VAE. Finally, tree-based models were combined into ensembles to improve the results without further training or model engineering. While some drop in raw performance with respect to state-of-the-art, classification-specific models is to be expected, we obtained encouraging results, which show the viability of our approach and the usability of the high-level features extracted by the autoencoders for classification tasks.
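To make the described pipeline concrete, the following is a minimal sketch, not the authors' implementation, of how latent features from a trained beta-VAE encoder could feed tree-based classifiers combined in an ensemble; the encoder object, data loaders, latent dimensionality, and the specific classifiers and voting scheme are assumptions for illustration only.

import numpy as np
import torch
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier,
                              VotingClassifier)

@torch.no_grad()
def extract_latents(encoder, loader, device="cpu"):
    """Encode each CXR batch and keep the latent mean as the feature vector.

    Assumes the beta-VAE encoder returns (mu, logvar) for a batch of images.
    """
    encoder.eval().to(device)
    feats, labels = [], []
    for x, y in loader:
        mu, _logvar = encoder(x.to(device))
        feats.append(mu.cpu().numpy())
        labels.append(y.numpy())
    return np.concatenate(feats), np.concatenate(labels)

# Hypothetical usage: `encoder`, `train_loader`, and `test_loader` are assumed
# to exist; the soft-voting ensemble of tree-based models is one possible
# ensembling scheme, not necessarily the one used in the paper.
# X_train, y_train = extract_latents(encoder, train_loader)
# X_test,  y_test  = extract_latents(encoder, test_loader)
# clf = VotingClassifier(
#     estimators=[("rf", RandomForestClassifier(n_estimators=300)),
#                 ("gb", GradientBoostingClassifier())],
#     voting="soft",
# )
# clf.fit(X_train, y_train)
# proba = clf.predict_proba(X_test)   # class probabilities for evaluation

The point of the sketch is that the autoencoder is used only as a frozen feature extractor: once the latent vectors are computed, any off-the-shelf classifier can be trained on them, and tree-based models can be ensembled without retraining the DL component.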