Classification performance on ImageNet is the de facto standard metric for convolutional neural network (CNN) development. In this work we challenge the notion that CNN architecture design based solely on ImageNet leads to generally effective architectures that perform well across a diverse set of datasets and application domains. To this end, we investigate and ultimately improve ImageNet as a basis for deriving such architectures. We conduct an extensive empirical study in which we train $500$ CNN architectures, sampled from the broad AnyNetX design space, on ImageNet as well as on $8$ additional well-known image classification benchmark datasets from a diverse array of application domains. We observe that the performance of an architecture is highly dataset-dependent; some datasets even exhibit a negative error correlation with ImageNet across all architectures. We show how to significantly increase these correlations by using ImageNet subsets restricted to fewer classes. These contributions can have a profound impact on the way we design future CNN architectures and help alleviate our community's current over-reliance on a single dataset.