Convolutional Neural Networks (CNNs) such as ResNet-50, DenseNet-40, and ResNeXt-56 are severely over-parameterized, demanding training resources that scale exponentially with increases in model depth. In this paper, we propose the Entropy-Based Convolutional Layer Estimation (EBCLE) heuristic, a robust and simple yet effective approach to resolving over-parameterization with respect to the network depth of CNN models. The EBCLE heuristic employs a priori knowledge of the entropic distribution of the input data to determine an upper bound on convolutional network depth, beyond which identity transformations prevail and contribute little to model performance. Restricting depth redundancies by forcing feature compression and abstraction curbs over-parameterization while decreasing training time by 24.99%-78.59% without degrading model performance. We present empirical evidence that broader yet shallower models trained with the EBCLE heuristic match or outperform the baseline classification accuracies of narrower yet deeper models. The EBCLE heuristic is architecture-agnostic, and EBCLE-based CNN models restrict depth redundancies, making better use of the available computational resources. The proposed heuristic offers researchers an analytical means of justifying their hyperparameter (HP) choices for CNNs. The EBCLE heuristic was empirically validated on five benchmark datasets (ImageNet32, CIFAR-10/100, STL-10, MNIST) and four network architectures (DenseNet, ResNet, ResNeXt, and EfficientNet B0-B2), with appropriate statistical tests employed to support the conclusions presented in this paper.
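The abstract does not specify the exact entropic quantity EBCLE computes, so the following is only a minimal illustrative sketch: it estimates the Shannon entropy of a dataset's pooled pixel-intensity distribution, the kind of a priori measure a depth-bounding heuristic of this sort could draw on. The helper name dataset_entropy and the intensity-histogram formulation are assumptions for illustration, not the authors' published method.

    import numpy as np

    def dataset_entropy(images, bins=256):
        """Shannon entropy (bits) of the pixel-intensity distribution
        pooled over a sample of images.

        images: array-like of shape (N, H, W) or (N, H, W, C),
        with integer intensities in [0, bins).
        NOTE: illustrative sketch only; EBCLE's actual entropy
        measure is defined in the paper body, not this abstract.
        """
        values = np.asarray(images).ravel().astype(np.int64)
        counts = np.bincount(values, minlength=bins)
        p = counts / counts.sum()
        p = p[p > 0]                       # drop empty bins; 0*log(0) := 0
        return -np.sum(p * np.log2(p))

    # Hypothetical usage: a low-entropy dataset should saturate at a
    # shallower depth bound than a high-entropy one.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        low = rng.integers(0, 8, size=(100, 32, 32))     # narrow intensity range
        high = rng.integers(0, 256, size=(100, 32, 32))  # full intensity range
        print(f"low-entropy sample:  {dataset_entropy(low):.2f} bits")
        print(f"high-entropy sample: {dataset_entropy(high):.2f} bits")

Under these assumptions, the entropy estimate would serve as the a priori input from which an upper bound on convolutional depth is derived; the mapping from entropy to layer count is the heuristic's contribution and is not reproduced here.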