Structural concept complexity, class overlap, and data scarcity are among the most important factors influencing classifier performance under class imbalance. When these effects were uncovered in the early 2000s, the classifiers on which they were demonstrated were, understandably, classical rather than deep learning approaches. As deep learning gains ground over classical machine learning and begins to be used in critical applied settings, it is important to assess systematically how well deep learning systems respond to the kinds of challenges their classical counterparts have struggled with over the past two decades. The purpose of this paper is to study the behavior of deep learning systems in settings previously deemed challenging for classical machine learning systems, and to find out whether the depth of these systems is an asset in such settings. Results on both artificial and real-world image datasets (Fashion-MNIST, CIFAR-10) show that these settings remain mostly challenging for deep learning systems, and that, in simple artificial domains, deeper architectures seem to help with structural concept complexity but not with class overlap. Data scarcity is not overcome by additional layers, either. In the real-world image domains, where overfitting is a greater concern than in the artificial domains, the advantage of deeper architectures is less obvious: it is observed in certain cases, but it quickly disappears as models get deeper and perform worse than their shallower counterparts.