X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion - 专知论文

会员服务 ·

0

示例 · MoDELS · 缩放 · 掩码 · Performer ·

2023 年 5 月 31 日

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

翻译：暂无翻译

Hanqing Zhao,Dianmo Sheng,Jianmin Bao,Dongdong Chen,Dong Chen,Fang Wen,Lu Yuan,Ce Liu,Wenbo Zhou,Qi Chu,Weiming Zhang,Nenghai Yu

from arxiv, ICML 2023, code is available at https://github.com/yoctta/XPaste

Copy-Paste is a simple and effective data augmentation strategy for instance segmentation. By randomly pasting object instances onto new background images, it creates new training data for free and significantly boosts the segmentation performance, especially for rare object categories. Although diverse, high-quality object instances used in Copy-Paste result in more performance gain, previous works utilize object instances either from human-annotated instance segmentation datasets or rendered from 3D object models, and both approaches are too expensive to scale up to obtain good diversity. In this paper, we revisit Copy-Paste at scale with the power of newly emerged zero-shot recognition models (e.g., CLIP) and text2image models (e.g., StableDiffusion). We demonstrate for the first time that using a text2image model to generate images or zero-shot recognition model to filter noisily crawled images for different object categories is a feasible way to make Copy-Paste truly scalable. To make such success happen, we design a data acquisition and processing framework, dubbed ``X-Paste", upon which a systematic study is conducted. On the LVIS dataset, X-Paste provides impressive improvements over the strong baseline CenterNet2 with Swin-L as the backbone. Specifically, it archives +2.6 box AP and +2.1 mask AP gains on all classes and even more significant gains with +6.8 box AP, +6.5 mask AP on long-tail classes. Our code and models are available at https://github.com/yoctta/XPaste.

翻译：暂无翻译

0

相关内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

III型AtOFPs转录因子对拟南芥荚果形态的调控

国家自然科学基金

0+阅读 · 2014年12月31日

Fur调控霍乱弧菌生物膜形成和TCP合成的分子机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

人源PCL家族蛋白参与表观遗传调控的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

肌酸激酶与CC2D1A和NF-κB相互作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

纳米复合结构的稀土掺杂透明玻璃陶瓷中表面等离子体共振增强的量子剪裁发光

国家自然科学基金

0+阅读 · 2012年12月31日

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Arxiv

0+阅读 · 2023年7月20日

Mining of Single-Class by Active Learning for Semantic Segmentation

Arxiv

0+阅读 · 2023年7月18日

Frequency-mixed Single-source Domain Generalization for Medical Image Segmentation

Arxiv

0+阅读 · 2023年7月18日

Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation

Arxiv

12+阅读 · 2021年12月16日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

VIP会员

文章信息

相关主题

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

CVPR 2020 论文开源项目合集

专知会员服务

110+阅读 · 2020年3月12日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

大模型推理时代的知识编辑

《利用人工智能对军事行动进行建模》

【MIT博士论文】加速科学发现的因果建模实践算法

机器人、无人机与实时影像：应对城市爆炸威胁的三大技术方案

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

相关论文

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Arxiv

0+阅读 · 2023年7月20日

Mining of Single-Class by Active Learning for Semantic Segmentation

Arxiv

0+阅读 · 2023年7月18日

Frequency-mixed Single-source Domain Generalization for Medical Image Segmentation

Arxiv

0+阅读 · 2023年7月18日

Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation

Arxiv

12+阅读 · 2021年12月16日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

相关基金

III型AtOFPs转录因子对拟南芥荚果形态的调控

国家自然科学基金

0+阅读 · 2014年12月31日

Fur调控霍乱弧菌生物膜形成和TCP合成的分子机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

人源PCL家族蛋白参与表观遗传调控的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

肌酸激酶与CC2D1A和NF-κB相互作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

纳米复合结构的稀土掺杂透明玻璃陶瓷中表面等离子体共振增强的量子剪裁发光

国家自然科学基金

0+阅读 · 2012年12月31日

微信扫码咨询专知VIP会员