Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning - Zhuanzhi Papers

January 11, 2021

Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning

Pavlos Avgoustinakis, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Andreas L. Symeonidis, Ioannis Kompatsiaris

In this work, we address the problem of audio-based near-duplicate video retrieval. We propose the Audio Similarity Learning (AuSiL) approach, which effectively captures temporal patterns of audio similarity between video pairs. To compute a robust similarity between two videos, we first extract representative audio-based video descriptors by leveraging transfer learning based on a Convolutional Neural Network (CNN) trained on a large-scale dataset of audio events, and then calculate a similarity matrix from the pairwise similarities of these descriptors. The similarity matrix is subsequently fed to a CNN that captures the temporal structures within it. We train our network with a triplet generation process, optimizing the triplet loss function. To evaluate the effectiveness of the proposed approach, we have manually annotated two publicly available video datasets based on the audio duplicity between their videos. The proposed approach achieves very competitive results compared to three state-of-the-art methods. Also, unlike the competing methods, it is very robust in retrieving audio duplicates generated with speed transformations.
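
The two core steps named in the abstract, the pairwise similarity matrix and the triplet loss, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: cosine similarity as the pairwise measure, a hinge-style loss with a `margin` parameter, and the toy descriptors. The function `video_score` (mean of row-wise maxima) is only a hypothetical stand-in for the CNN that AuSiL applies to the similarity matrix.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def similarity_matrix(desc_a, desc_b):
    """Pairwise cosine similarity between two descriptor sequences.

    Rows index the time steps of video A, columns those of video B.
    Cosine similarity is an assumption; the abstract only says "pairwise similarity".
    """
    a = [l2_normalize(v) for v in desc_a]
    b = [l2_normalize(v) for v in desc_b]
    return [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]


def video_score(matrix):
    """Hypothetical stand-in for the similarity CNN: mean of row-wise maxima."""
    return sum(max(row) for row in matrix) / len(matrix)


def triplet_loss(sim_pos, sim_neg, margin=0.5):
    """Hinge-style triplet loss on similarity scores (assumed form and margin)."""
    return max(0.0, sim_neg - sim_pos + margin)


# Toy two-step audio descriptors (illustrative values only).
anchor = [[1.0, 0.0], [0.0, 1.0]]
positive = [[2.0, 0.0], [0.0, 3.0]]   # same directions as the anchor
negative = [[1.0, 1.0], [1.0, 1.0]]   # different directions

sim_pos = video_score(similarity_matrix(anchor, positive))
sim_neg = video_score(similarity_matrix(anchor, negative))
loss = triplet_loss(sim_pos, sim_neg)
print(sim_pos, sim_neg, loss)
```

The loss is zero once the anchor-positive score exceeds the anchor-negative score by at least the margin, which is the ordering a retrieval system needs; in the actual method the score would come from the trained network rather than from `video_score`.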
