Most computer vision research focuses on datasets containing thousands of images of commonplace objects. However, many high-impact datasets, such as those in medicine and the geosciences, contain fine-grained objects that require domain-expert knowledge to recognize and are time-consuming to collect and annotate. As a result, these datasets contain few labeled images, and current machine vision models cannot be trained intensively on them. Model-editing techniques, originally introduced to correct large language models, have been shown to improve model performance using only small amounts of both data and additional training. Using a Mask R-CNN to segment ancient reef fossils in rock sample images, we present a two-part paradigm for improving fossil segmentation with few labeled images: we first identify model weaknesses using image perturbations and then mitigate those weaknesses using model editing. Specifically, we apply domain-informed image perturbations to expose the Mask R-CNN's inability to distinguish between different classes of fossils and its inconsistency in segmenting fossils with different textures. To address these shortcomings, we extend an existing model-editing method, designed to correct systematic mistakes in image classification, to image segmentation without requiring any additional labeled data, and we show that it reduces confusion between different kinds of fossils. We also identify the best settings for model editing in our setting: making a single edit using all relevant pixels in one image (rather than using multiple images, multiple edits, or fewer pixels). Though we focus on fossil segmentation, our approach may be useful in other fine-grained segmentation problems where data is limited.
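The first step of the paradigm, probing a segmentation model with image perturbations, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`mask_iou`, `add_texture_noise`, `perturbation_consistency`, `toy_segment`) are hypothetical, Gaussian pixel noise stands in for the domain-informed perturbations, and a simple threshold stands in for the Mask R-CNN. The idea shown is only that a model's masks on an image and on its perturbed copies can be compared (here with intersection over union) to see how consistent the segmentation is.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection over union of two binary masks (1.0 if both are empty)."""
    a = a.astype(bool)
    b = b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def add_texture_noise(image, strength, rng):
    """Assumed stand-in perturbation: additive Gaussian noise, clipped to [0, 1]."""
    noisy = image + rng.normal(0.0, strength, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def perturbation_consistency(segment, image, perturb, strengths, seed=0):
    """Compare the mask on the clean image with masks on perturbed copies.

    `segment` is any callable mapping an image to a binary mask; in practice
    it would wrap a trained segmentation model.
    """
    rng = np.random.default_rng(seed)
    baseline = segment(image)
    return {
        s: mask_iou(baseline, segment(perturb(image, s, rng)))
        for s in strengths
    }


def toy_segment(image):
    """Hypothetical placeholder model: threshold the pixel intensities."""
    return image > 0.5


# Synthetic "rock sample": a bright square on a dark background.
image = np.zeros((32, 32))
image[8:24, 8:24] = 0.9

scores = perturbation_consistency(
    toy_segment, image, add_texture_noise, strengths=[0.0, 0.1, 0.4]
)
for strength, iou in scores.items():
    print(f"noise strength {strength}: IoU vs. clean mask = {iou:.3f}")
```

A sharp drop in IoU at a given perturbation strength flags a weakness worth targeting. With a real model, `toy_segment` would be replaced by a wrapper around the trained network, and the perturbation by one chosen with domain knowledge.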