Accurate understanding of anatomical structures is essential for reliably staging certain dental diseases. One way to introduce this understanding into semantic segmentation models is through hierarchy-aware methodologies. However, existing hierarchy-aware segmentation methods largely encode anatomical structure through the loss function, which provides only weak and indirect supervision. We introduce a general framework that embeds an explicit anatomical hierarchy into semantic segmentation by coupling a recurrent, level-wise prediction scheme with restrictive output heads and top-down feature conditioning. At each depth of the class tree, the backbone is re-run on the original image concatenated with the logits from the previous level. Child-class features are conditioned on their parent-class probabilities via Feature-wise Linear Modulation (FiLM), which modulates the child feature space for fine-grained detection. A probabilistic composition rule enforces consistency between parent and descendant classes. The hierarchical loss combines per-level class-weighted Dice and cross-entropy losses with a consistency term that encourages each parent prediction to equal the sum of its children. We validate our approach on our proposed dataset, TL-pano, which contains 194 panoramic radiographs with dense instance and semantic segmentation annotations of tooth layers and alveolar bone. Using UNet and HRNet as donor models under 5-fold cross-validation, the hierarchical variants consistently increase IoU, Dice, and recall, particularly for fine-grained anatomies, and produce more anatomically coherent masks. However, the hierarchical variants also favoured recall over precision, implying more false positives. The results demonstrate that explicit hierarchical structuring improves both performance and clinical plausibility, especially in low-data dental imaging regimes.
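The probabilistic composition rule and the parent-child consistency idea can be illustrated with a minimal per-pixel sketch. This is one plausible reading of the abstract, not the authors' implementation: it assumes each leaf probability is the parent probability multiplied by a softmax restricted to that parent's children. The class names, the two-level tree, and the helpers `compose` and `consistency_gap` are hypothetical, and FiLM conditioning and the recurrent backbone pass are left out.

```python
import math

# Hypothetical two-level class tree; names are illustrative, not taken from TL-pano.
HIERARCHY = {
    "background": [],
    "tooth": ["enamel", "dentin", "pulp"],
    "bone": [],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def compose(parent_logits, child_logits):
    """Per-pixel composition (assumed rule): P(child) = P(parent) * P(child | parent).

    parent_logits: dict mapping parent name -> logit (softmax over all parents).
    child_logits:  dict mapping parent name -> {child name -> logit}; the softmax
                   is restricted to the children of that parent.
    """
    names = list(parent_logits)
    parent_probs = dict(zip(names, softmax([parent_logits[n] for n in names])))
    leaf_probs = {}
    for parent, children in HIERARCHY.items():
        if not children:
            # A parent without children is itself a leaf class.
            leaf_probs[parent] = parent_probs[parent]
            continue
        cond = softmax([child_logits[parent][c] for c in children])
        for child, p in zip(children, cond):
            leaf_probs[child] = parent_probs[parent] * p
    return parent_probs, leaf_probs


def consistency_gap(parent_probs, leaf_probs):
    """Largest |P(parent) - sum of P(children)| over parents that have children."""
    return max(
        abs(parent_probs[p] - sum(leaf_probs[c] for c in kids))
        for p, kids in HIERARCHY.items()
        if kids
    )


# Example logits for a single pixel (made-up values).
parents, leaves = compose(
    {"background": 0.1, "tooth": 2.0, "bone": -0.5},
    {"tooth": {"enamel": 1.0, "dentin": 0.2, "pulp": -1.0}},
)
```

Under this assumed rule the leaf probabilities sum to one and each parent equals the sum of its children by construction, so `consistency_gap(parents, leaves)` is zero up to floating-point error. A consistency loss term, as mentioned in the abstract, would matter when the levels are predicted separately and this identity is not guaranteed.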