We present Differentiable Neural Architectures (DNArch), a method that jointly learns the weights and the architecture of Convolutional Neural Networks (CNNs) by backpropagation. In particular, DNArch allows learning (i) the size of convolutional kernels at each layer, (ii) the number of channels at each layer, (iii) the position and values of downsampling layers, and (iv) the depth of the network. To this end, DNArch views neural architectures as continuous multidimensional entities and uses learnable differentiable masks along each dimension to control their size. Unlike existing methods, DNArch is not limited to a predefined set of possible neural components, but is instead able to discover entire CNN architectures across all combinations of kernel sizes, widths, depths, and downsampling. Empirically, DNArch finds performant CNN architectures for several classification and dense prediction tasks on both sequential and image data. When combined with a loss term that accounts for network complexity, DNArch finds powerful architectures that respect a predefined computational budget.
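The abstract describes the masking mechanism only at a high level. The sketch below illustrates one way a learnable differentiable mask could control the effective kernel size of a 1D convolution, so that the size is trained by backpropagation together with the weights. It is a minimal illustration under our own assumptions, not the paper's implementation; all names (`MaskedConv1d`, `log_sigma`, `max_kernel_size`) are hypothetical.

```python
import torch
import torch.nn as nn

class MaskedConv1d(nn.Module):
    """1D convolution whose effective kernel size is governed by a
    learnable differentiable mask (illustrative sketch, not DNArch's code)."""

    def __init__(self, in_channels: int, out_channels: int, max_kernel_size: int = 31):
        super().__init__()
        # Weights for the largest kernel the mask could ever expose.
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels, max_kernel_size) * 0.02
        )
        # Learnable log-width of a Gaussian mask over kernel positions.
        # A small width concentrates the mask at the center, which acts
        # like a small kernel; a large width exposes the full kernel.
        self.log_sigma = nn.Parameter(torch.tensor(0.0))
        # Relative kernel positions in [-1, 1], centered at 0.
        self.register_buffer("positions", torch.linspace(-1.0, 1.0, max_kernel_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        sigma = self.log_sigma.exp()
        # The mask is differentiable w.r.t. sigma, so the "kernel size"
        # receives gradients through ordinary backpropagation.
        mask = torch.exp(-0.5 * (self.positions / sigma) ** 2)
        masked_weight = self.weight * mask  # broadcasts over (out, in, k)
        return nn.functional.conv1d(x, masked_weight, padding="same")


# Usage: the mask width is optimized jointly with the convolution weights.
conv = MaskedConv1d(in_channels=8, out_channels=16)
x = torch.randn(4, 8, 128)
y = conv(x)          # shape (4, 16, 128)
y.sum().backward()   # gradients reach both conv.weight and conv.log_sigma
```

Per the abstract, the same masking principle is applied along each architectural dimension, including channel counts, network depth, and the placement of downsampling, so the entire architecture can be shaped by gradient descent.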