Despite their immense success, deep convolutional neural networks (CNNs) can be difficult to optimize and costly to train because they often stack hundreds of layers. Conventional convolutional operations are fundamentally limited by their linear nature and fixed activations, so many layers are needed to learn meaningful patterns in data. Given the sheer size of these networks, this approach is computationally inefficient and risks overfitting or exploding gradients, especially on small datasets. To address this, we introduce a "plug-in" module called the Residual Kolmogorov-Arnold Network (RKAN). The module is highly compact, so it can be easily added into any stage (level) of a traditional deep network, where it learns to integrate supportive polynomial feature transformations into the existing convolutional framework. RKAN offers consistent improvements over baseline models across different vision tasks and widely tested benchmarks, achieving state-of-the-art performance on them.
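To make the "plug-in" idea concrete, below is a minimal PyTorch sketch of one way such a module could look: a compact branch that expands reduced features with a Chebyshev polynomial basis and adds the result residually to a stage's convolutional output. This is an illustrative assumption, not the paper's exact design; the class name `RKANBlock`, the channel reduction, the polynomial degree, and the placement are all hypothetical choices made for the example.

```python
import torch
import torch.nn as nn


class RKANBlock(nn.Module):
    """Hypothetical sketch of a residual KAN plug-in module.

    A 1x1 convolution reduces channels, a Chebyshev polynomial
    expansion supplies the learnable nonlinear basis (in place of a
    fixed activation), and the mixed result is added residually to
    the stage's own output.
    """

    def __init__(self, channels: int, reduced: int = 16, degree: int = 3):
        super().__init__()
        self.degree = degree
        self.reduce = nn.Conv2d(channels, reduced, kernel_size=1)
        # Mix all polynomial orders back up to the stage's channel count.
        self.poly_conv = nn.Conv2d(
            reduced * (degree + 1), channels, kernel_size=3, padding=1
        )
        self.norm = nn.BatchNorm2d(channels)

    def chebyshev(self, x: torch.Tensor) -> torch.Tensor:
        # Chebyshev polynomials T_k expect inputs in [-1, 1].
        x = torch.tanh(x)
        terms = [torch.ones_like(x), x]
        for _ in range(2, self.degree + 1):
            # Recurrence: T_k(x) = 2x * T_{k-1}(x) - T_{k-2}(x)
            terms.append(2 * x * terms[-1] - terms[-2])
        return torch.cat(terms[: self.degree + 1], dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.reduce(x)      # (B, reduced, H, W)
        z = self.chebyshev(z)   # (B, reduced * (degree + 1), H, W)
        z = self.poly_conv(z)   # (B, channels, H, W)
        return x + self.norm(z)  # residual combination with the stage output


# Usage: attach the module to the output of one stage of a backbone.
feat = torch.randn(2, 256, 14, 14)   # e.g., a ResNet stage-3 feature map
out = RKANBlock(256)(feat)
print(out.shape)  # torch.Size([2, 256, 14, 14]) -- shape is preserved
```

Because the block preserves the feature-map shape, it can be inserted after any stage without altering the rest of the network, which is what makes a module like this "plug-in".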