This study presents a novel approach to bone age assessment (BAA) using a multi-view, multi-task classification model based on the Sauvegrain method. The Sauvegrain method assigns a maturity score to each landmark in the elbow and then predicts the bone age. A straightforward way to automate it is to train an independent classifier for each region of interest (RoI), but this restricts each classifier to local morphology and increases computational cost. To address this, this work proposes a self-accumulative vision transformer (SAT), which applies token replay and regional attention bias to mitigate the anisotropic behavior that commonly arises in multi-view, multi-task problems and limits the effectiveness of vision transformers. Experiments show that SAT exploits the relationships between landmarks and learns global morphological features, yielding a BAA mean absolute error 0.11 lower than that of the previous work. In addition, SAT has four times fewer parameters than the ensemble of individual classifiers used in the previous work. Finally, this work offers implications for clinical practice, improving the accuracy and efficiency of BAA in diagnosing abnormal growth in adolescents.
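The abstract names regional attention bias but does not define it. As a rough illustration only, the sketch below assumes a generic additive bias on attention logits that favors token pairs from the same RoI. The function name `biased_attention_weights`, the `regions` labels, and the `bias` value are all hypothetical; this is not the SAT formulation, and token replay is not modeled.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention_weights(scores, regions, bias=1.0):
    """Return row-wise attention weights with an additive regional bias.

    scores:  n x n list of raw attention logits (query i, key j).
    regions: list of n region ids, one per token (assumed RoI labels).
    bias:    value added to the logit when query and key share a region
             (an assumed, hand-picked constant, not a learned parameter).
    """
    n = len(scores)
    weights = []
    for i in range(n):
        row = [
            scores[i][j] + (bias if regions[i] == regions[j] else 0.0)
            for j in range(n)
        ]
        weights.append(softmax(row))
    return weights


# Three tokens: the first two come from one RoI, the third from another.
logits = [[0.0, 0.0, 0.0] for _ in range(3)]
weights = biased_attention_weights(logits, regions=[0, 0, 1], bias=1.0)
```

With uniform logits, each token in this sketch attends more strongly to tokens from its own region than to tokens from other regions. In a trained model the bias would more plausibly be learned than fixed.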