Sample reweighting is an effective strategy for learning from training data drawn from a mixture of subpopulations. In volumetric medical image segmentation, the input images are similarly distributed, but the associated labels fall into two subpopulations -- "label-sparse" and "label-dense" -- depending on whether the image occurs near the beginning or end of the volumetric scan or near the middle. Existing reweighting algorithms have focused on hard- and soft-thresholding of the label-sparse data, which discards valuable input data and thereby loses information and reduces sample efficiency. For this setting, we propose AdaWAC, an adaptive weighting algorithm that introduces a set of trainable weights which, at the saddle point of the underlying objective, assign label-dense samples to the supervised cross-entropy loss and label-sparse samples to unsupervised consistency regularization. We provide a convergence guarantee for AdaWAC by recasting the optimization as online mirror descent on a saddle point problem. Moreover, we empirically demonstrate that AdaWAC not only enhances segmentation performance and sample efficiency but also improves robustness to the subpopulation shift in labels.
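To make the weighting idea concrete, the following is a minimal sketch of a per-sample weight update and is not the AdaWAC implementation. It assumes details the abstract does not state: one weight `beta[i]` in [0, 1] per sample, placed on the cross-entropy loss with `1 - beta[i]` on the consistency loss, and an entropic (exponentiated-gradient) mirror-ascent step on those weights. The names `update_weights`, `weighted_objective`, `demo`, and the toy loss values are hypothetical and chosen only for illustration.

```python
import numpy as np


def update_weights(beta, loss_ce, loss_ac, lr):
    """One entropic mirror-ascent step on per-sample weights (assumed form).

    beta[i] weights the supervised cross-entropy loss of sample i and
    1 - beta[i] weights its consistency-regularization loss.
    """
    # Subtract the per-sample maximum before exponentiating for stability.
    shift = np.maximum(loss_ce, loss_ac)
    w_ce = beta * np.exp(lr * (loss_ce - shift))
    w_ac = (1.0 - beta) * np.exp(lr * (loss_ac - shift))
    # Renormalize so the two weights of each sample sum to one.
    return w_ce / (w_ce + w_ac)


def weighted_objective(beta, loss_ce, loss_ac):
    """Average of the per-sample weighted sums of the two losses."""
    return float(np.mean(beta * loss_ce + (1.0 - beta) * loss_ac))


def demo(steps=50, lr=1.0):
    """Toy run with fixed, made-up losses for two samples."""
    # Sample 0 mimics a label-dense slice (large cross-entropy loss);
    # sample 1 mimics a label-sparse slice (near-zero cross-entropy loss).
    loss_ce = np.array([2.0, 0.01])
    loss_ac = np.array([0.1, 0.5])
    beta = np.array([0.5, 0.5])
    for _ in range(steps):
        beta = update_weights(beta, loss_ce, loss_ac, lr)
    return beta


if __name__ == "__main__":
    print(demo())
```

With the losses held fixed, the weight of the label-dense sample drifts toward the cross-entropy term and that of the label-sparse sample toward the consistency term, which mirrors the assignment described above. In actual training the model parameters would be updated in alternation with the weights, so both losses would change from step to step.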