The emergence of large models, also known as foundation models, has brought significant advancements to AI research. One such model is Segment Anything (SAM), which is designed for image segmentation tasks. However, as with other foundation models, our experimental findings suggest that SAM may fail or perform poorly on certain segmentation tasks, such as shadow detection and camouflaged object detection (concealed object detection). This study paves the way for applying the large pre-trained image segmentation model SAM to these downstream tasks, even in situations where SAM performs poorly. Rather than fine-tuning the SAM network, we propose \textbf{SAM-Adapter}, which incorporates domain-specific information or visual prompts into the segmentation network through simple yet effective adapters. Our extensive experiments show that SAM-Adapter significantly elevates the performance of SAM on challenging tasks, and it can even outperform task-specific network models, achieving state-of-the-art performance on the tasks we tested: camouflaged object detection and shadow detection. We believe our work opens up opportunities for utilizing SAM in downstream tasks, with potential applications in various fields, including medical image processing, agriculture, remote sensing, and more.
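To illustrate the general idea of injecting domain-specific information into a frozen backbone, the following is a minimal sketch of a bottleneck adapter. It is a hypothetical illustration in NumPy, not the authors' implementation: the class name, dimensions, and the use of a per-token prompt (e.g. high-frequency image components) added residually to encoder features are all assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class PromptAdapter:
    """Hypothetical adapter sketch: a small bottleneck MLP that maps a
    task-specific prompt into the frozen encoder's feature space and
    adds it residually, so only the adapter weights need training."""
    def __init__(self, prompt_dim, embed_dim, hidden_dim=32):
        # Small random weights stand in for trained parameters.
        self.W_down = rng.standard_normal((prompt_dim, hidden_dim)) * 0.02
        self.W_up = rng.standard_normal((hidden_dim, embed_dim)) * 0.02

    def __call__(self, features, prompt):
        # features: (tokens, embed_dim) output of a frozen encoder layer
        # prompt:   (tokens, prompt_dim) domain-specific cues per token
        return features + relu(prompt @ self.W_down) @ self.W_up

# Usage: inject a prompt into dummy encoder features.
adapter = PromptAdapter(prompt_dim=64, embed_dim=256)
feats = rng.standard_normal((196, 256))   # stand-in for frozen-backbone tokens
prompt = rng.standard_normal((196, 64))   # stand-in for visual-prompt features
out = adapter(feats, prompt)
print(out.shape)  # (196, 256)
```

The residual form leaves the backbone's features intact when the adapter outputs zero, which is one reason bottleneck adapters are a common choice for adapting large frozen models.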