L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing

Eric Guizzo, Riccardo F. Gramaccioni, Saeid Jamili, Christian Marinoni, Edoardo Massaro, Claudia Medaglia, Giuseppe Nachira, Leonardo Nucciarelli, Ludovica Paglialunga, Marco Pennese, Sveva Pepe, Enrico Rocchi, Aurelio Uncini, Danilo Comminiello

From arXiv. Documentation paper for the L3DAS21 Challenge for IEEE MLSP 2021. Further information at www.l3das.com/mlsp2021

The L3DAS21 Challenge aims to encourage and foster collaborative research on machine learning for 3D audio signal processing, with a particular focus on 3D speech enhancement (SE) and 3D sound event localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Machine learning approaches to 3D audio tasks are usually based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based on multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, this is the first time that a dual-mic Ambisonics configuration has been used for these tasks. We provide baseline models and results for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and SELDNet for SELD. This report provides all the information needed to participate in the L3DAS21 Challenge, describing the L3DAS21 dataset, the challenge tasks and the baseline models.
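As a rough illustration of the dual-mic configuration described above: a first-order Ambisonics signal has four channels, so two such microphones give eight channels per recording. The sketch below is not the L3DAS21 Python API. The function name `stack_dual_mic`, the nested-list layout and the toy sample values are assumptions made only for this example.

```python
# Hypothetical sketch; none of these names come from the L3DAS21 API.
FOA_CHANNELS = 4  # a first-order Ambisonics signal has 4 channels


def stack_dual_mic(mic_a, mic_b):
    """Concatenate two first-order Ambisonics recordings along the channel axis.

    Each input is a list of 4 channels; each channel is a list of samples.
    """
    for name, mic in (("mic_a", mic_a), ("mic_b", mic_b)):
        if len(mic) != FOA_CHANNELS:
            raise ValueError(
                f"{name} must have {FOA_CHANNELS} channels, got {len(mic)}"
            )
    # Every channel of both microphones must cover the same time span.
    lengths = {len(channel) for channel in mic_a + mic_b}
    if len(lengths) != 1:
        raise ValueError("all channels must have the same number of samples")
    # Result has 8 channels: microphone A first, then microphone B.
    return mic_a + mic_b


# Toy data: 4 channels x 3 samples per microphone (placeholder values).
mic_a = [[0.0, 0.1, 0.2] for _ in range(FOA_CHANNELS)]
mic_b = [[0.0, -0.1, -0.2] for _ in range(FOA_CHANNELS)]
stacked = stack_dual_mic(mic_a, mic_b)
print(len(stacked), len(stacked[0]))
```

Keeping both microphones in one channel-stacked input is only one possible choice. A model could equally process the two four-channel signals in separate branches.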

