Heart failure remains a major public health challenge with growing costs. Ejection fraction (EF) is a key metric for the diagnosis and management of heart failure; however, estimating EF with echocardiography remains expensive for the healthcare system and is subject to intra- and inter-operator variability. While chest x-rays (CXRs) are quick, inexpensive, and require less expertise, they do not provide sufficient information for the human eye to estimate EF. This work explores the efficacy of computer vision techniques for predicting reduced EF solely from CXRs. We studied a dataset of 3488 CXRs from the MIMIC CXR-jpg (MCR) dataset. Our work establishes benchmarks using multiple state-of-the-art convolutional neural network architectures. The subsequent analysis shows that increasing model size from 8M to 23M parameters improved classification performance without overfitting the dataset. We also show that data augmentation techniques such as CXR rotation and random cropping improve model performance by a further ~5%. Finally, we conduct an error analysis using saliency maps and Grad-CAMs to better understand the failure modes of convolutional models on this task.
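As an illustration of the two augmentations named above (rotation and random cropping), the following is a minimal NumPy sketch. It is not the authors' pipeline: the abstract does not state rotation ranges, crop sizes, or interpolation, so `max_degrees`, `crop_frac`, the nearest-neighbor sampling, and all function names here are assumptions chosen for illustration.

```python
import numpy as np


def rotate_nearest(img, degrees):
    """Rotate a 2D image about its center using nearest-neighbor sampling.

    Output pixels whose source falls outside the image are set to zero.
    The visual direction of rotation depends on the axis convention in use.
    """
    h, w = img.shape
    theta = np.deg2rad(degrees)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    y0, x0 = ys - cy, xs - cx
    # Inverse mapping: for each output pixel, locate the source pixel.
    src_x = np.cos(theta) * x0 + np.sin(theta) * y0 + cx
    src_y = -np.sin(theta) * x0 + np.cos(theta) * y0 + cy
    sx = np.rint(src_x).astype(int)
    sy = np.rint(src_y).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(img)
    out[valid] = img[sy[valid], sx[valid]]
    return out


def random_crop(img, crop_h, crop_w, rng):
    """Return a crop of size (crop_h, crop_w) at a uniformly random offset."""
    h, w = img.shape
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]


def augment(img, rng, max_degrees=10.0, crop_frac=0.9):
    """Apply a small random rotation followed by a random crop.

    max_degrees and crop_frac are hypothetical defaults, not values
    reported in the source.
    """
    angle = rng.uniform(-max_degrees, max_degrees)
    rotated = rotate_nearest(img, angle)
    h, w = img.shape
    crop_h = int(round(h * crop_frac))
    crop_w = int(round(w * crop_frac))
    return random_crop(rotated, crop_h, crop_w, rng)
```

In a training loop, `augment` would typically be called on each image every epoch with a shared `np.random.default_rng(seed)` generator, so the model sees a slightly different view of each CXR each time; the cropped output would then be resized to the network's input resolution.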