Self-Supervised Moving Vehicle Detection from Audio-Visual Cues - Zhuanzhi Papers


January 30, 2022

Self-Supervised Moving Vehicle Detection from Audio-Visual Cues


Jannik Zürn, Wolfram Burgard

From arXiv; 8 pages, 6 figures

Robust detection of moving vehicles is a critical task for any autonomously operating outdoor robot or self-driving vehicle. Most modern approaches for solving this task rely on training image-based detectors using large-scale vehicle detection datasets such as nuScenes or the Waymo Open Dataset. Providing manual annotations is an expensive and laborious exercise that does not scale well in practice. To tackle this problem, we propose a self-supervised approach that leverages audio-visual cues to detect moving vehicles in videos. Our approach employs contrastive learning for localizing vehicles in images from corresponding pairs of images and recorded audio. In extensive experiments carried out with a real-world dataset, we demonstrate that our approach provides accurate detections of moving vehicles and does not require manual annotations. We furthermore show that our model can be used as a teacher to supervise an audio-only detection model. This student model is invariant to illumination changes and thus effectively bridges the domain gap inherent to models leveraging exclusively vision as the predominant modality.
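The abstract does not state which contrastive objective is used, so the following is only a minimal sketch of the general idea, assuming a generic InfoNCE-style loss over paired image and audio embeddings. The function names, the `temperature` value, and the toy two-dimensional embeddings are illustrative assumptions and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce_loss(image_embeddings, audio_embeddings, temperature=0.1):
    """Mean InfoNCE loss in the image-to-audio direction.

    Assumption: image_embeddings[i] and audio_embeddings[i] come from the
    same video clip (positive pair); every other audio embedding in the
    batch serves as a negative for image i.
    """
    n = len(image_embeddings)
    total = 0.0
    for i in range(n):
        # Similarity of image i to every audio clip, scaled by temperature.
        logits = [
            cosine_similarity(image_embeddings[i], audio_embeddings[j]) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        # Negative log-probability of picking the matching audio clip.
        total += log_sum_exp - logits[i]
    return total / n


# Toy embeddings (hypothetical): two clips with orthogonal features.
images = [[1.0, 0.0], [0.0, 1.0]]
audio_aligned = [[1.0, 0.0], [0.0, 1.0]]
audio_shuffled = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce_loss(images, audio_aligned)
loss_shuffled = info_nce_loss(images, audio_shuffled)
```

In this toy setup the loss is lower when each image embedding matches its own audio embedding than when the audio clips are swapped, which is the signal a contrastive objective of this kind relies on. Localizing vehicles within an image would additionally require spatial (per-region) image features, which this sketch leaves out.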



