Data augmentation is an essential technique for improving accuracy in deep-learning-based object recognition. Methods that generate mixed data from multiple samples, such as mixup, introduce diversity that is absent from the training data and thus contribute significantly to accuracy. However, because the data to be mixed are sampled at random throughout training, appropriate classes or samples are not always selected. In this study, we propose a data augmentation method that computes the distance between classes from class probabilities and uses it to select data from suitable classes for mixing during training. The mixed data are adjusted dynamically according to the training trend of each class to facilitate training. The proposed method is used in combination with conventional methods for generating mixed data. Evaluation experiments show that it improves recognition performance on both general and long-tailed image recognition datasets.
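The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the Euclidean distance between per-class mean probability vectors, the preference for nearer classes, the `temperature` parameter, and all function names (`class_distances`, `partner_weights`, `sample_mixed`) are hypothetical. Only the mixing step follows standard mixup, with the coefficient drawn from a Beta distribution.

```python
import math
import random


def class_distances(mean_probs):
    """Pairwise distance between classes.

    mean_probs[c] is the average predicted probability vector over
    training samples of class c. Euclidean distance is an assumption;
    the abstract does not specify the metric.
    """
    n = len(mean_probs)
    return [[math.dist(mean_probs[i], mean_probs[j]) for j in range(n)]
            for i in range(n)]


def partner_weights(dist_row, own_class, temperature=1.0):
    """Sampling weights over partner classes for one source class.

    Assumption: nearer classes are favoured, and the source class
    itself is excluded. The opposite preference is equally plausible.
    """
    scores = [0.0 if c == own_class else math.exp(-d / temperature)
              for c, d in enumerate(dist_row)]
    total = sum(scores)
    return [s / total for s in scores]


def mixup(x_a, y_a, x_b, y_b, lam):
    """Standard mixup: convex combination of inputs and of labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def sample_mixed(x_a, class_a, data_by_class, mean_probs, rng, alpha=1.0):
    """Pick a partner class by distance, then mix one sample from it."""
    n = len(mean_probs)
    weights = partner_weights(class_distances(mean_probs)[class_a], class_a)
    class_b = rng.choices(range(n), weights=weights, k=1)[0]
    x_b = rng.choice(data_by_class[class_b])
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in (0, 1)
    y_a = [1.0 if k == class_a else 0.0 for k in range(n)]
    y_b = [1.0 if k == class_b else 0.0 for k in range(n)]
    return mixup(x_a, y_a, x_b, y_b, lam)
```

In this sketch, `mean_probs` would be recomputed from the model's current predictions (for example once per epoch), so the partner distribution shifts as training progresses. That is one possible reading of the dynamic adjustment the abstract mentions, not a confirmed detail of the method.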