Data augmentation is an inexpensive way to increase training data diversity and is commonly achieved via transformations of existing data. For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss. This paper investigates the use of training objectives that explicitly impose this consistency constraint and how it can impact downstream audio classification tasks. In the context of deep convolutional neural networks in the supervised setting, we show empirically that certain measures of consistency are not implicitly captured by the cross-entropy loss and that incorporating such measures into the loss function can improve the performance of audio classification systems. Put another way, we demonstrate how existing augmentation methods can further improve learning by enforcing consistency.
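To make the idea concrete, below is a minimal sketch of one way such a consistency term could be combined with the cross-entropy loss. It is an illustrative formulation, not the paper's exact objective: the `model` returning a `(logits, embedding)` pair, the use of an MSE penalty between embeddings of the clean and augmented views, and the weight `lam` are all assumptions for this example.

```python
import torch
import torch.nn.functional as F

def consistency_regularized_loss(model, x, x_aug, labels, lam=1.0):
    """Cross-entropy on both views plus an embedding-consistency penalty.

    Assumes `model(x)` returns (logits, embedding); `lam` weights the
    consistency term. Illustrative only, not the paper's exact loss.
    """
    logits, emb = model(x)              # clean input
    logits_aug, emb_aug = model(x_aug)  # augmented input (e.g., pitch-shifted audio)

    # Standard supervised cross-entropy, applied to both views.
    ce = F.cross_entropy(logits, labels) + F.cross_entropy(logits_aug, labels)

    # Consistency term: penalize representation drift under augmentation,
    # explicitly encouraging augmentation-invariant embeddings.
    consistency = F.mse_loss(emb, emb_aug)

    return ce + lam * consistency
```

Other consistency measures (e.g., a KL divergence between the two views' softmax outputs, or a cosine distance between embeddings) could be substituted for the MSE term; the key point is that the invariance is imposed as an explicit loss component rather than left for the cross-entropy term to capture implicitly.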