We propose a novel neural model compression strategy combining data augmentation, knowledge transfer, pruning, and quantization for device-robust acoustic scene classification (ASC). Specifically, we tackle the ASC task in a low-resource environment by leveraging a recently proposed advanced neural network pruning mechanism, namely the Lottery Ticket Hypothesis (LTH), to find a sub-network neural model with a small number of non-zero model parameters. The effectiveness of LTH for low-complexity acoustic modeling is assessed by investigating various data augmentation and compression schemes, and we report an efficient joint framework for low-complexity multi-device ASC, called \emph{Acoustic Lottery}. Acoustic Lottery can compress an ASC model to $1/10^{4}$ of its original size while attaining superior performance (validation accuracy of 79.4\% and log loss of 0.64) compared to its uncompressed seed model. All results reported in this work are based on a joint effort of four groups, namely GT-USTC-UKE-Tencent, aiming to address the "Low-Complexity Acoustic Scene Classification (ASC) with Multiple Devices" task in the DCASE 2021 Challenge Task 1a.
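To make the LTH-based sparsification concrete, the following is a minimal sketch of the core masking-and-rewinding step behind lottery-ticket pruning: a binary mask is derived from the magnitudes of trained weights, then applied to the original initialization to obtain the "winning ticket" sub-network. The function name `lth_prune` and the NumPy-only formulation are illustrative assumptions, not the actual implementation used in this work.

```python
import numpy as np

def lth_prune(init_weights, trained_weights, sparsity):
    """Lottery-Ticket-style one-shot magnitude pruning (illustrative sketch).

    init_weights    : list of arrays, weights at initialization
    trained_weights : list of arrays, weights after training
    sparsity        : fraction of parameters to remove (e.g. 0.9 keeps 10%)
    Returns the rewound sparse weights (the "winning ticket") and the masks.
    """
    # Global magnitude threshold across all layers of the trained model.
    flat = np.abs(np.concatenate([w.ravel() for w in trained_weights]))
    threshold = np.quantile(flat, sparsity)
    # Keep only weights whose trained magnitude exceeds the threshold.
    masks = [np.abs(w) > threshold for w in trained_weights]
    # Rewind: apply the surviving mask to the ORIGINAL initialization,
    # which is the defining step of the Lottery Ticket Hypothesis.
    ticket = [w0 * m for w0, m in zip(init_weights, masks)]
    return ticket, masks
```

In iterative LTH pruning, this step is repeated over several train-prune-rewind rounds with a modest per-round sparsity, which is what allows the extreme compression ratios reported above.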