ViTs for SITS: Vision Transformers for Satellite Image Time Series


April 14, 2023


Michail Tarasiou, Erik Chavez, Stefanos Zafeiriou

From arXiv; 11 pages, 5 figures, 2 tables

In this paper we introduce the Temporo-Spatial Vision Transformer (TSViT), a fully-attentional model for general Satellite Image Time Series (SITS) processing based on the Vision Transformer (ViT). TSViT splits a SITS record into non-overlapping patches in space and time, which are tokenized and subsequently processed by a factorized temporo-spatial encoder. We argue that, in contrast to natural images, a temporal-then-spatial factorization is more intuitive for SITS processing, and we present experimental evidence for this claim. Additionally, we enhance the model's discriminative power by introducing two novel mechanisms: acquisition-time-specific temporal positional encodings and multiple learnable class tokens. The effect of all novel design choices is evaluated through an extensive ablation study. Our proposed architecture achieves state-of-the-art performance, surpassing previous approaches by a significant margin on three publicly available SITS semantic segmentation and classification datasets. All model, training, and evaluation code is made publicly available to facilitate further research.
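The tokenization and temporal-then-spatial ordering described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' released code: the function names (`patchify`, `temporal_sequences`, `attention_pairs`), the nested-list layout `[T][H][W][C]`, and the assumption that the spatial stage sees one summary token per location are all hypothetical choices made for illustration. The learned projection, positional encodings, and class tokens are omitted.

```python
from itertools import product


def patchify(record, t, h, w):
    """Split a [T][H][W][C] nested list into non-overlapping (t, h, w) patches.

    Returns a dict mapping (time_idx, row_idx, col_idx) to a flat token
    (a list of t*h*w*C raw values). A real model would follow this with a
    learned linear projection, which is omitted here.
    """
    T, H, W = len(record), len(record[0]), len(record[0][0])
    if T % t or H % h or W % w:
        raise ValueError("patch size must divide the record size exactly")
    tokens = {}
    for it, ih, iw in product(range(T // t), range(H // h), range(W // w)):
        token = []
        for dt, dh, dw in product(range(t), range(h), range(w)):
            token.extend(record[it * t + dt][ih * h + dh][iw * w + dw])
        tokens[(it, ih, iw)] = token
    return tokens


def temporal_sequences(tokens):
    """Group tokens by spatial location: one time-ordered sequence per location.

    These are the sequences a temporal encoder would process first in a
    temporal-then-spatial factorization.
    """
    groups = {}
    for (it, ih, iw) in sorted(tokens):  # sorted keys keep time order
        groups.setdefault((ih, iw), []).append(tokens[(it, ih, iw)])
    return groups


def attention_pairs(n_t, n_s):
    """Count query-key pairs for joint vs. temporal-then-spatial attention.

    n_t is the number of temporal tokens, n_s the number of spatial locations.
    """
    joint = (n_t * n_s) ** 2
    # Temporal stage: n_s sequences of length n_t. Spatial stage (assumed here
    # to see one summary token per location): one sequence of length n_s.
    factorized = n_s * n_t ** 2 + n_s ** 2
    return joint, factorized


if __name__ == "__main__":
    # Toy record: 4 acquisitions of a 4x4 image with 2 channels.
    record = [[[[float(ti), float(r * 4 + c)] for c in range(4)]
               for r in range(4)] for ti in range(4)]
    tokens = patchify(record, t=1, h=2, w=2)
    sequences = temporal_sequences(tokens)
    print(len(tokens), len(sequences), attention_pairs(4, 4))
```

In this toy setting the record yields 16 tokens arranged as 4 per-location sequences of 4 time steps each. The pair count shows why factorizing is attractive: attention over all tokens jointly grows with the square of the total token count, while the factorized scheme only pays the square of each axis separately.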

