The conventional spatial convolution layers in Convolutional Neural Networks (CNNs) are computationally expensive, to the point that training can take days unless the number of layers, the number of training images, or the size of the training images is reduced. An image size of 256x256 pixels is common for most CNN applications, but it is too small for applications such as Diabetic Retinopathy (DR) classification, where fine image detail is important for accurate classification. This research proposes Frequency Domain Convolution (FDC) and Frequency Domain Pooling (FDP) layers, built with the RFFT, a kernel initialization strategy, convolution artifact removal, and Channel Independent Convolution (CIC), to replace the conventional convolution and pooling layers. The FDC and FDP layers are used to build a Frequency Domain Convolutional Neural Network (FDCNN) that accelerates training on large images for DR classification. The Full FDC layer is an extension of the FDC layer that allows direct use in conventional CNNs; it is also used to modify the VGG16 architecture. The FDCNN is shown to be at least 54.21% faster and 70.74% more memory efficient than an equivalent CNN architecture. The modified VGG16 architecture with the Full FDC layer achieves a shorter training time and a higher accuracy, at 95.63%, than the original VGG16 architecture for DR classification.
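The speedup claimed above rests on the convolution theorem: convolution in the spatial domain becomes element-wise multiplication in the frequency domain, and the real FFT (RFFT) stores only half the spectrum, which is where the memory savings come from. The following is a minimal NumPy sketch of that core idea only; the function name, shapes, and the use of circular (rather than padded linear) convolution are illustrative assumptions, not the paper's actual FDC layer implementation.

```python
import numpy as np

def fft_conv2d(image, kernel):
    """Circular 2-D convolution of `image` with `kernel` via the real FFT.

    Illustrative sketch (not the paper's FDC layer): multiply the RFFT
    spectra and transform back, which by the convolution theorem equals
    circular convolution in the spatial domain.
    """
    h, w = image.shape
    # Zero-pad the kernel to the image size so the two spectra align.
    k = np.zeros_like(image)
    kh, kw = kernel.shape
    k[:kh, :kw] = kernel
    # rfft2 keeps roughly half of the last frequency axis, halving storage.
    return np.fft.irfft2(np.fft.rfft2(image) * np.fft.rfft2(k), s=(h, w))

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
ker = rng.standard_normal((3, 3))

# Cross-check against the direct definition of circular convolution.
direct = np.zeros_like(img)
for i in range(8):
    for j in range(8):
        for u in range(3):
            for v in range(3):
                direct[i, j] += ker[u, v] * img[(i - u) % 8, (j - v) % 8]

assert np.allclose(fft_conv2d(img, ker), direct)
```

For an N x N image the frequency-domain product costs O(N^2 log N) via the FFT instead of O(N^2 k^2) for a direct k x k spatial convolution, so the advantage grows with image size, consistent with the paper's focus on large DR images.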