Visual-Assisted Sound Source Depth Estimation in the Wild - 专知 (Zhuanzhi) Papers

Estimation/Estimator · Range · Virtual Reality (VR) · Inference · 3D

July 7, 2022

Visual-Assisted Sound Source Depth Estimation in the Wild


Wei Sun, Lili Qiu

From arXiv; 13 pages; in submission.

Depth estimation enables a wide variety of 3D applications, such as robotics, autonomous driving, and virtual reality. Despite significant work in this area, it remains an open question how to enable accurate, low-cost, high-resolution, and large-range depth estimation. Inspired by the flash-to-bang phenomenon (i.e., hearing the thunder after seeing the lightning), this paper develops FBDepth, the first audio-visual depth estimation framework. It uses the difference between the time-of-flight (ToF) of the light and that of the sound to infer the sound source depth. FBDepth is the first to incorporate video and audio with both semantic features and spatial hints for range estimation. It first aligns the correspondence between the video track and the audio track to locate the target object and target sound at a coarse granularity. Based on the observation of moving objects' trajectories, FBDepth proposes to estimate the intersection of optical flow before and after the sound production to locate video events in time. FBDepth then feeds the estimated timestamp of the video event and the audio clip into the final depth estimation. We use a mobile phone to collect 3000+ video clips with 20 different objects at up to 50 m. FBDepth decreases the Absolute Relative error (AbsRel) by 55% compared to RGB-based methods.
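The flash-to-bang principle in the abstract can be illustrated with a small Python sketch. It is not the authors' pipeline. It only shows the underlying arithmetic under stated assumptions: a fixed speed of sound of 343 m/s (dry air at roughly 20 °C), a one-dimensional object position per frame standing in for optical flow, and straight-line motion before and after the sound-producing event. All function names (`depth_from_delay`, `fit_line`, `event_time_from_trajectories`) are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0          # assumed: dry air at about 20 degrees C
SPEED_OF_LIGHT_M_S = 299_792_458.0  # exact by definition of the metre


def depth_from_delay(delay_s, v_sound=SPEED_OF_SOUND_M_S,
                     v_light=SPEED_OF_LIGHT_M_S):
    """Return the distance d that satisfies d/v_sound - d/v_light == delay_s."""
    if delay_s < 0:
        raise ValueError("sound cannot arrive before light")
    return delay_s / (1.0 / v_sound - 1.0 / v_light)


def fit_line(times, positions):
    """Least-squares fit of positions ~ slope * t + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(positions) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, positions))
    slope = sxy / sxx
    return slope, mean_p - slope * mean_t


def event_time_from_trajectories(before, after):
    """Time at which the fitted pre-event and post-event lines intersect.

    `before` and `after` are (times, positions) pairs. This simplified
    stand-in lets the event fall between two video frames.
    """
    s1, b1 = fit_line(*before)
    s2, b2 = fit_line(*after)
    if s1 == s2:
        raise ValueError("parallel trajectories never intersect")
    return (b2 - b1) / (s1 - s2)


# Usage with made-up numbers: a ball tracked at 30 fps bounces between frames.
before = ([0 / 30, 1 / 30, 2 / 30], [3.0, 2.0, 1.0])  # approaching the ground
after = ([4 / 30, 5 / 30], [2 / 3, 4 / 3])            # rebounding
t_video = event_time_from_trajectories(before, after)  # sub-frame timestamp
t_audio = 0.15                                         # assumed audio onset (s)
print(f"event at {t_video:.4f} s, depth {depth_from_delay(t_audio - t_video):.2f} m")
```

The sketch also suggests why locating the video event in time matters: at 30 frames per second, one frame lasts 1/30 s, during which sound travels about 11.4 m, so snapping the event to the nearest frame could dominate the error at ranges up to 50 m.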

