Event Camera Data Pre-training - 专知 (Zhuanzhi) Papers

Event camera · Events · RGB images · Similarity · Embeddings

April 5, 2023

Event Camera Data Pre-training

Yan Yang, Liyuan Pan, Liu Liu

This paper proposes a pre-trained neural network for handling event camera data. Our model is a self-supervised learning framework trained on paired event camera data and natural RGB images. Our method contains three modules connected in sequence: i) a family of event data augmentations that generate meaningful event images for self-supervised training; ii) a conditional masking strategy that samples informative event patches from event images, encouraging the model to capture the spatial layout of a scene and accelerating training; iii) a contrastive learning approach that enforces the similarity of embeddings between matching event images, and between paired event and RGB images. An embedding projection loss is proposed to avoid model collapse when enforcing the event image embedding similarities. A probability distribution alignment loss is proposed to encourage each event image to be consistent with its paired RGB image in the feature space. Transfer learning performance on downstream tasks shows the superiority of our method over state-of-the-art methods. For example, we achieve a top-1 accuracy of 64.83% on the N-ImageNet dataset.
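To make the contrastive step in module iii) more concrete, here is a minimal sketch of a standard InfoNCE-style loss over paired event and RGB embeddings. The abstract does not give the exact loss form, so InfoNCE is an assumption used only for illustration. It is not the paper's embedding projection loss or its probability distribution alignment loss. The function names, the temperature value, and the toy 3-dimensional embeddings are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(event_embs, rgb_embs, temperature=0.1):
    """Mean InfoNCE loss; event_embs[i] and rgb_embs[i] are treated as the positive pair.

    Assumption: a generic contrastive objective, not the loss defined in the paper.
    """
    events = [l2_normalize(e) for e in event_embs]
    rgbs = [l2_normalize(r) for r in rgb_embs]
    total = 0.0
    for i, e in enumerate(events):
        # Temperature-scaled cosine similarity of event i against every RGB embedding.
        logits = [sum(a * b for a, b in zip(e, r)) / temperature for r in rgbs]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the paired RGB embedding.
        total += log_denom - logits[i]
    return total / len(events)


# Hypothetical toy batch: each event embedding points roughly the same way as its RGB pair.
event_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
rgb_embs = [[0.9, 0.2, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

aligned = info_nce(event_embs, rgb_embs)
# Rotating the RGB list breaks the pairing, so the loss should increase.
shuffled = info_nce(event_embs, rgb_embs[1:] + rgb_embs[:1])
print(aligned < shuffled)
```

In this sketch the loss is small when each event embedding is closest to its own RGB partner and grows when the pairing is broken, which is the behaviour a similarity-enforcing objective is meant to reward.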
