What audio embedding approach generalizes best to a wide range of downstream tasks across a variety of everyday domains without fine-tuning? The aim of the HEAR benchmark is to develop a general-purpose audio representation that provides a strong basis for learning in a wide variety of tasks and scenarios. HEAR evaluates audio representations using a benchmark suite spanning a variety of domains, including speech, environmental sound, and music. HEAR was launched as a NeurIPS 2021 shared challenge. In the spirit of shared exchange, each participant submitted an audio embedding model following a common API that is general-purpose, open-source, and freely available to use. Twenty-nine models from thirteen external teams were evaluated on nineteen diverse downstream tasks derived from sixteen datasets. Open evaluation code, submitted models, and datasets are key contributions, enabling comprehensive and reproducible evaluation, as well as previously impossible longitudinal studies. It remains an open question whether a single general-purpose audio representation can perform as holistically as the human ear.
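The common API mentioned above standardizes how every submitted model is loaded and queried; the HEAR API defines `load_model`, `get_scene_embeddings` (one embedding per clip), and `get_timestamp_embeddings` (one embedding per frame, with timestamps). The sketch below illustrates that shape only: real submissions operate on torch tensors and load trained weights, whereas the toy RMS featurizer, NumPy arrays, hop size, and timestamp convention here are illustrative assumptions.

```python
# Hedged sketch of the HEAR common API shape. Function names follow the
# published HEAR API; the toy RMS "embedding", NumPy arrays, and the
# 50 ms hop are assumptions made for a dependency-light illustration.
import numpy as np


class ToyModel:
    sample_rate = 16000           # audio sample rate the model expects
    scene_embedding_size = 8      # dimensionality of per-clip embeddings
    timestamp_embedding_size = 8  # dimensionality of per-frame embeddings


def load_model(model_file_path: str = "") -> ToyModel:
    """Return a model object; a real entry would load weights from disk."""
    return ToyModel()


def get_scene_embeddings(audio: np.ndarray, model: ToyModel) -> np.ndarray:
    """Map (n_sounds, n_samples) audio to (n_sounds, scene_embedding_size).

    Toy featurizer: split each clip into equal-ish chunks and take the
    RMS energy of each chunk as one embedding dimension.
    """
    chunks = np.array_split(audio, model.scene_embedding_size, axis=1)
    return np.stack([np.sqrt((c ** 2).mean(axis=1)) for c in chunks], axis=1)


def get_timestamp_embeddings(audio: np.ndarray, model: ToyModel,
                             hop_ms: float = 50.0):
    """Frame the audio and embed each frame.

    Returns (embeddings, timestamps):
      embeddings -- (n_sounds, n_frames, timestamp_embedding_size)
      timestamps -- (n_sounds, n_frames), frame centers in milliseconds
    """
    n_sounds, n_samples = audio.shape
    hop = int(model.sample_rate * hop_ms / 1000)
    n_frames = n_samples // hop
    emb = np.stack(
        [get_scene_embeddings(audio[:, i * hop:(i + 1) * hop], model)
         for i in range(n_frames)],
        axis=1,
    )
    ts = np.tile(np.arange(n_frames) * hop_ms + hop_ms / 2, (n_sounds, 1))
    return emb, ts
```

Because every submission exposes these same entry points, the HEAR evaluation harness can run each model on all nineteen tasks without any model-specific glue code, which is what makes the longitudinal comparisons in the paper possible.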