Dynamic convolution enhances model capacity by adaptively combining multiple kernels, yet it faces critical trade-offs: prior works either (1) incur significant parameter overhead by scaling the number of kernels linearly, (2) compromise inference speed through complex kernel interactions, or (3) struggle to jointly optimize dynamic attention and static kernels. We observe that pre-trained Convolutional Neural Networks (CNNs) exhibit inter-layer redundancy akin to that in Large Language Models (LLMs). Specifically, dense convolutional layers can be efficiently replaced by derived "child" layers generated from a shared "parent" convolutional kernel through an adapter. To address these limitations and exploit this weight-sharing mechanism, we propose a lightweight convolution kernel plug-in, named KernelDNA. It decouples kernel adaptation into input-dependent dynamic routing and pre-trained static modulation, ensuring both parameter efficiency and hardware-friendly inference. Unlike existing dynamic convolutions that expand parameters via multi-kernel ensembles, our method leverages cross-layer weight sharing and adapter-based modulation, enabling dynamic kernel specialization without altering the standard convolution structure. This design preserves the native computational efficiency of standard convolutions while enhancing representational power through input-adaptive kernel adjustments. Experiments on image classification and dense prediction tasks demonstrate that KernelDNA achieves a state-of-the-art accuracy-efficiency balance among dynamic convolution variants.
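To make the parent/child weight-sharing idea concrete, below is a minimal PyTorch sketch of the mechanism as described in the abstract: several "child" layers reuse one shared "parent" kernel, each adding only a lightweight static adapter plus an input-dependent routing branch. The class names (`ParentKernel`, `ChildConv`), the per-channel scaling adapter, and the squeeze-and-excitation-style router are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ParentKernel(nn.Module):
    """Shared 'parent' 3x3 kernel bank reused by several child layers (assumed design)."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(channels, channels, kernel_size, kernel_size) * 0.02
        )


class ChildConv(nn.Module):
    """A 'child' layer: static adapter modulation of the shared parent weight,
    plus an input-dependent channel-wise routing signal (illustrative sketch)."""
    def __init__(self, parent: ParentKernel, channels):
        super().__init__()
        self.parent = parent
        # static modulation: lightweight per-layer adapter over output channels
        self.static_scale = nn.Parameter(torch.ones(channels, 1, 1, 1))
        # dynamic routing: squeeze-and-excitation-style attention over channels
        self.route = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # derive this layer's kernel from the shared parent weight
        weight = self.parent.weight * self.static_scale
        out = F.conv2d(x, weight, padding=1)  # standard dense convolution
        return out * self.route(x)            # input-adaptive modulation


# usage: two layers share one parent kernel, so per-layer parameters grow only by the adapters
parent = ParentKernel(channels=64)
layer1, layer2 = ChildConv(parent, 64), ChildConv(parent, 64)
x = torch.randn(2, 64, 32, 32)
y = layer2(layer1(x))
print(y.shape)  # torch.Size([2, 64, 32, 32])
```

Note that each forward pass still executes a single standard convolution per layer, which is consistent with the abstract's claim of preserving the native computational efficiency of standard convolutions while sharing weights across layers.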