The emergence of large models, also known as foundation models, has brought significant advances to AI research. One such model is Segment Anything (SAM), which is designed for image segmentation tasks. However, as with other foundation models, our experimental findings show that SAM may fail or perform poorly on certain segmentation tasks, such as shadow detection and camouflaged object detection (concealed object detection). This study paves the way for applying the large pre-trained image segmentation model SAM to these downstream tasks, even where SAM performs poorly. Rather than fine-tuning the SAM network, we propose \textbf{SAM-Adapter}, which incorporates domain-specific information or visual prompts into the segmentation network through simple yet effective adapters. Our extensive experiments show that SAM-Adapter significantly elevates the performance of SAM on challenging tasks; it even outperforms task-specific network models and achieves state-of-the-art performance on the tasks we tested: camouflaged object detection and shadow detection. We believe our work opens up opportunities for utilizing SAM in downstream tasks, with potential applications in various fields, including medical image processing, agriculture, remote sensing, and more.
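The abstract's core idea, injecting domain-specific information into a frozen backbone through lightweight adapters rather than fine-tuning, can be sketched as follows. This is a minimal, hypothetical illustration of the general adapter pattern, not the paper's actual architecture: the weight shapes, the ReLU nonlinearity, and the additive fusion with the frozen features are all illustrative assumptions.

```python
import numpy as np

def adapter(x, w_down, w_up):
    """Lightweight MLP adapter: down-project the task-specific input,
    apply a nonlinearity, and up-project back to the embedding size.
    (ReLU used here for simplicity; the actual choice is an assumption.)"""
    h = np.maximum(0.0, x @ w_down)
    return h @ w_up

rng = np.random.default_rng(0)
embed_dim, hidden_dim = 8, 4          # toy sizes, not the real model's
w_down = rng.standard_normal((embed_dim, hidden_dim))
w_up = rng.standard_normal((hidden_dim, embed_dim))

# Placeholders: features from the frozen pre-trained encoder, and a
# domain-specific input (e.g. a visual prompt derived from the image).
frozen_features = rng.standard_normal((16, embed_dim))
task_input = rng.standard_normal((16, embed_dim))

# Only the small adapter weights would be trained; the backbone stays frozen.
fused = frozen_features + adapter(task_input, w_down, w_up)
print(fused.shape)  # (16, 8)
```

Because only the adapter parameters are trained, this keeps the cost of adapting the large model to a new task small compared with full fine-tuning.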