Deep learning, especially convolutional neural networks, has accelerated progress in computer vision and changed everyday practice. Standardized deep learning modules (also known as backbone networks), e.g., ResNet and EfficientNet, have further enabled efficient and rapid development of new computer vision solutions. Yet deep learning methods still suffer from several drawbacks. One of the most concerning is their high memory and computational cost, which requires dedicated computing units, typically GPUs, for training and development. In this paper, we therefore propose a quantifiable evaluation method, the convolutional kernel redundancy measure, which is based on perceived image differences and guides the simplification of the network structure. Applied to chest X-ray image classification with ResNet, our method maintains the performance of the network while reducing the number of parameters from over $23$ million to approximately $128$ thousand (a $99.46\%$ reduction).
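As a quick sanity check on the reported reduction, the arithmetic can be sketched as follows. The function name `parameter_reduction` is hypothetical, and the inputs are the rounded figures quoted above (23 million and 128 thousand), not the exact parameter counts, which the abstract does not give.

```python
def parameter_reduction(original: int, reduced: int) -> float:
    """Return the fraction of parameters removed when going from
    `original` to `reduced` parameters (hypothetical helper)."""
    if original <= 0 or not 0 <= reduced <= original:
        raise ValueError("expected 0 <= reduced <= original and original > 0")
    return 1.0 - reduced / original


# Rounded figures from the abstract; the exact counts are assumptions.
fraction = parameter_reduction(23_000_000, 128_000)
print(f"{fraction * 100:.2f}% of parameters removed")  # prints "99.44% of parameters removed"
```

With the rounded inputs the result is about 99.44%, slightly below the reported 99.46%. This is consistent with the original network having somewhat more than 23 million parameters, as the text states.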