VinDr-SpineXR: A deep learning framework for spinal lesions detection and classification from radiographs

From arXiv. This is a preprint of our paper, which was accepted for publication at the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2021).

Radiographs are the most important imaging tool for identifying spine anomalies in clinical practice. The evaluation of spinal bone lesions, however, is a challenging task for radiologists. This work aims to develop and evaluate a deep learning-based framework, named VinDr-SpineXR, for the classification and localization of abnormalities in spine X-rays. First, we build a large dataset comprising 10,468 spine X-ray images from 5,000 studies, each of which is manually annotated by an experienced radiologist with bounding boxes around abnormal findings in 13 categories. Using this dataset, we then train a deep learning classifier to determine whether a spine scan is abnormal and a detector to localize 7 crucial findings among the 13. VinDr-SpineXR is evaluated on a test set of 2,078 images from 1,000 studies, which is kept separate from the training set. It achieves an area under the receiver operating characteristic curve (AUROC) of 88.61% (95% CI 87.19%, 90.02%) for the image-level classification task and a mean average precision (mAP@0.5) of 33.56% for the lesion-level localization task. These results serve as a proof of concept and set a baseline for future research in this direction. To encourage advances, the dataset, code, and trained deep learning models are made publicly available.
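To make the two reported metrics concrete, here is a minimal sketch of the quantities behind them. It is not taken from the released code: the function names `iou` and `auroc`, and the `(x_min, y_min, x_max, y_max)` box format, are assumptions made for illustration. AUROC is computed here as the fraction of (abnormal, normal) image pairs that the classifier ranks correctly, and the "0.5" in mAP@0.5 is the intersection-over-union threshold above which a predicted box counts as matching an annotated one.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def auroc(labels, scores):
    """AUROC as the fraction of (abnormal, normal) pairs ranked correctly.

    labels: 1 for abnormal images, 0 for normal ones.
    scores: predicted abnormality scores, higher means more abnormal.
    Ties between an abnormal and a normal score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one abnormal and one normal image")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))


# A predicted lesion box counts as a hit at the 0.5 threshold only if it
# overlaps the radiologist's box with IoU >= 0.5.
predicted = (5, 0, 15, 10)
annotated = (0, 0, 10, 10)
is_hit = iou(predicted, annotated) >= 0.5
```

The pairwise AUROC above is quadratic in the number of images, which is fine for illustration; on a test set of a few thousand images a rank-based implementation from an established library would be the more usual choice.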
