The demand for large medical imaging datasets keeps growing as machine learning algorithms, parallel computing, and hardware technology evolve. Accordingly, there is a growing demand for pooling data from multiple clinical and academic institutes to enable large-scale clinical or translational research studies. Magnetic resonance imaging (MRI) is a frequently used, non-invasive imaging modality. However, constructing a large MRI data repository poses multiple challenges related to privacy, data size, the DICOM format, logistics, and non-standardized images. Not only is building the data repository difficult, but using data pooled from it is also challenging, owing to heterogeneity in image acquisition, reconstruction, and processing pipelines across MRI vendors and imaging sites. This position paper describes the challenges of constructing a large MRI data repository and of using data downloaded from such repositories. To help address these challenges, the paper proposes introducing a quality assessment pipeline and outlines considerations and general design principles for it.