SAM3-UNet: Simplified Adaptation of Segment Anything Model 3

Xinyu Xiong, Zihuang Wu, Lei Lu, Yufa Xia

From arXiv, Technical Report

In this paper, we introduce SAM3-UNet, a simplified variant of Segment Anything Model 3 (SAM3), designed to adapt SAM3 for downstream tasks at a low cost. Our SAM3-UNet consists of three components: a SAM3 image encoder, a simple adapter for parameter-efficient fine-tuning, and a lightweight U-Net-style decoder. Preliminary experiments on multiple tasks, such as mirror detection and salient object detection, demonstrate that the proposed SAM3-UNet outperforms the prior SAM2-UNet and other state-of-the-art methods, while requiring less than 6 GB of GPU memory during training with a batch size of 12. The code is publicly available at https://github.com/WZH0120/SAM3-UNet.
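The abstract names "a simple adapter for parameter-efficient fine-tuning" but does not describe its design. As a rough illustration only, the sketch below shows one common adapter form, a residual bottleneck, in plain Python. The class name `BottleneckAdapter`, the dimensions, the GELU activation, and the zero initialization are all assumptions made for this example; they are not taken from the paper or its repository.

```python
import math
import random
from typing import List


def gelu(v: float) -> float:
    """Tanh approximation of the GELU activation."""
    return 0.5 * v * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)))


class BottleneckAdapter:
    """Hypothetical residual adapter: x + up(gelu(down(x))). Not the paper's code."""

    def __init__(self, dim: int, hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.dim, self.hidden = dim, hidden
        # Down-projection (hidden x dim) with a small random init.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(hidden)]
        # Up-projection (dim x hidden), zero init so the adapter starts as an identity map.
        self.up = [[0.0] * hidden for _ in range(dim)]

    def __call__(self, x: List[float]) -> List[float]:
        # Project the feature vector down, apply the nonlinearity, project back up.
        h = [gelu(sum(w * xi for w, xi in zip(row, x))) for row in self.down]
        delta = [sum(w * hi for w, hi in zip(row, h)) for row in self.up]
        # Residual connection: the frozen encoder feature passes through unchanged
        # plus a learned correction.
        return [xi + di for xi, di in zip(x, delta)]

    def num_params(self) -> int:
        # Two weight matrices, biases omitted for brevity.
        return 2 * self.dim * self.hidden


adapter = BottleneckAdapter(dim=8, hidden=2)
token = [1.0] * 8
print(adapter(token) == token)   # identity check at initialization
print(adapter.num_params())      # trainable weights in this toy adapter
```

In a setup like this, the encoder weights would typically stay frozen and only the adapter and decoder parameters would be trained, which is the usual way adapters keep fine-tuning memory low.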
