Deep learning models have achieved remarkable success in computer vision, but they still rely heavily on large-scale labeled data and tend to overfit when data are limited or distributions shift. Data augmentation, particularly mask-based information dropping, can enhance robustness by forcing models to explore complementary cues; however, existing approaches often lack structural awareness and may discard essential semantics. We propose Granular-ball Guided Masking (GBGM), a structure-aware augmentation strategy guided by Granular-ball Computing (GBC). GBGM adaptively preserves semantically rich, structurally important regions while suppressing redundant areas through a coarse-to-fine hierarchical masking process, producing augmentations that are both representative and discriminative. Extensive experiments on multiple benchmarks demonstrate consistent improvements in classification accuracy and masked image reconstruction, confirming the effectiveness and broad applicability of the proposed method. Simple and model-agnostic, it integrates seamlessly into CNNs and Vision Transformers and provides a new paradigm for structure-aware data augmentation.
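To make the coarse-to-fine idea concrete, the following is a minimal sketch under stated assumptions, not the authors' GBGM algorithm. The abstract does not specify how granular balls are built or scored, so the sketch substitutes a quadtree-style split with intensity variance as a stand-in for "structural importance": homogeneous blocks are treated as redundant and masked, while blocks that still contain structure are split further and finally kept. The function names, the variance criterion, and the parameters `block`, `min_block`, and `threshold` are all hypothetical.

```python
from statistics import pvariance


def block_variance(img, top, left, size):
    """Population variance of the intensities inside one square block."""
    values = [img[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    return pvariance(values)


def coarse_to_fine_mask(img, block=8, min_block=2, threshold=0.01):
    """Return a keep-mask (1 = keep, 0 = masked) with the same shape as img.

    Assumptions (illustrative only, not taken from the paper):
      * img is a 2D list of grayscale floats in [0, 1];
      * height and width are multiples of `block`;
      * `block` equals `min_block` times a power of two;
      * low variance stands in for "redundant", high variance for
        "structurally important".
    """
    height, width = len(img), len(img[0])
    mask = [[1] * width for _ in range(height)]

    def visit(top, left, size):
        if block_variance(img, top, left, size) < threshold:
            # Homogeneous block: treat as redundant and mask it out.
            for r in range(top, top + size):
                for c in range(left, left + size):
                    mask[r][c] = 0
        elif size > min_block:
            # Block still contains structure: refine into four sub-blocks.
            half = size // 2
            for d_top in (0, half):
                for d_left in (0, half):
                    visit(top + d_top, left + d_left, half)
        # Otherwise: smallest block that still has structure, so keep it.

    # Coarse pass over the image; each block is refined recursively.
    for top in range(0, height, block):
        for left in range(0, width, block):
            visit(top, left, block)
    return mask


# Toy 8x8 image with a vertical edge between columns 4 and 5.
img = [[1.0 if c >= 5 else 0.0 for c in range(8)] for _ in range(8)]
mask = coarse_to_fine_mask(img)
print(mask[0])  # [0, 0, 0, 0, 1, 1, 0, 0]
```

In this toy case only the 2x2 blocks straddling the edge survive, which mirrors the stated goal of preserving structurally important regions while suppressing redundant ones. A practical version would operate on arrays or tensors and would replace the variance test with the granular-ball criterion defined in the paper.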