Collecting a sufficient amount of data that represents diverse acoustic environments is a critical problem for distributed acoustic machine learning. Several audio data augmentation techniques have been introduced to address this problem, but they tend to rely on simple manipulations of existing data and are insufficient to cover the variability of real environments. We propose a method that extends a technique previously used for transferring acoustic style textures between audio recordings, applying it to transfer audio signatures between environments for distributed acoustic data augmentation. This paper also devises metrics, based on classification accuracy and content preservation, to evaluate the generated acoustic data. A series of experiments conducted on the UrbanSound8K dataset shows that the proposed method generates higher-quality audio data with transferred environmental features while preserving content features.
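To make the approach concrete, the following is a minimal sketch of spectrogram-based acoustic style transfer in the spirit of the Gram-matrix texture technique the abstract refers to. It is not the authors' implementation: the use of random shallow 1-D convolution features, the function names (`log_spectrogram`, `transfer`), and all hyperparameters (filter count, kernel size, loss weight, step count) are illustrative assumptions.

```python
# Hedged sketch: transfer the acoustic "texture" of an environment recording
# onto a content recording by optimizing a log-magnitude spectrogram against
# a Gram-matrix style loss and a feature-matching content loss.
# All settings below are assumptions, not the paper's configuration.
import torch
import torch.nn.functional as F
import librosa
import numpy as np

N_FFT = 1024

def log_spectrogram(path, sr=22050):
    """Load audio and return a log-magnitude STFT as a (1, freq, time) tensor."""
    y, _ = librosa.load(path, sr=sr)
    S = np.abs(librosa.stft(y, n_fft=N_FFT))
    return torch.tensor(np.log1p(S), dtype=torch.float32).unsqueeze(0)

def gram(features):
    """Gram matrix over time: keeps the environment 'texture', discards timing."""
    c, t = features.shape[-2], features.shape[-1]
    f = features.reshape(c, t)
    return (f @ f.T) / t

def transfer(content_path, style_path, steps=500, style_weight=1e3):
    content = log_spectrogram(content_path)   # source recording whose content we keep
    style = log_spectrogram(style_path)       # recording from the target environment

    # Random shallow 1-D conv features are a common choice for audio textures.
    conv = torch.nn.Conv1d(content.shape[1], 256, kernel_size=11, padding=5)
    for p in conv.parameters():
        p.requires_grad_(False)              # only the spectrogram is optimized

    def feats(S):
        return F.relu(conv(S))

    content_feats = feats(content)
    style_gram = gram(feats(style))

    x = content.clone().requires_grad_(True)  # optimize the spectrogram itself
    opt = torch.optim.Adam([x], lr=0.05)
    for _ in range(steps):
        opt.zero_grad()
        fx = feats(x)
        loss = F.mse_loss(fx, content_feats) \
            + style_weight * F.mse_loss(gram(fx), style_gram)
        loss.backward()
        opt.step()

    # Invert the optimized log-magnitude spectrogram back to audio (Griffin-Lim).
    S = np.expm1(x.detach().squeeze(0).numpy())
    return librosa.griffinlim(S, n_fft=N_FFT)
```

The design choice this sketch illustrates is the split the abstract's metrics mirror: the Gram matrix averages feature correlations over time, so the style loss captures environmental texture without temporal alignment, while the per-frame feature loss ties the result to the original content.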