Technical Report: Quality Assessment Tool for Machine Learning with Clinical CT

Image Quality Assessment (IQA) is important for scientific inquiry, especially in medical imaging and machine learning. Potential data quality issues can be exacerbated when human-based workflows use limited views of the data that may obscure digital artifacts. In practice, multiple factors such as network issues, accelerated acquisitions, motion artifacts, and imaging protocol design can impede the interpretation of image collections. The medical image processing community has developed a wide variety of tools for the inspection and validation of imaging data. Yet IQA of computed tomography (CT) remains an under-recognized challenge, and no user-friendly tool is commonly available to address these potential issues. Here, we create and illustrate a pipeline specifically designed to identify and resolve issues encountered in large-scale data mining of clinically acquired CT data. Using the widely studied National Lung Screening Trial (NLST), we identified quality concerns in approximately 4% of image volumes out of 17,392 scans. To assess robustness, we applied the proposed pipeline to our internal datasets and found that the tool generalizes to clinically acquired medical images. In conclusion, the tool has been useful and time-saving for research studies of clinical data, and the code and tutorials are publicly available at https://github.com/MASILab/QA_tool.
