Deep neural networks (DNNs) are so over-parameterized that recent research has found them to already contain, at their randomly initialized state, a subnetwork with high accuracy. Finding such a subnetwork is a viable alternative to training the weights. In parallel, another line of work has hypothesized that deep residual networks (ResNets) approximate the behaviour of shallow recurrent neural networks (RNNs) and has proposed a way of compressing them into recurrent models. This paper proposes blending these lines of research into a highly compressed yet accurate model: Hidden-Fold Networks (HFNs). By first folding ResNet into a recurrent structure and then searching for an accurate subnetwork hidden within the randomly initialized model, a high-performing yet tiny HFN is obtained without ever updating the weights. As a result, HFN achieves performance equivalent to ResNet50 on CIFAR100 while occupying 38.5x less memory, and performance similar to ResNet34 on ImageNet with a memory size 26.8x smaller. HFN becomes even more attractive when run on inference accelerators for highly quantized, randomly weighted DNNs, since it minimizes data transfers while staying accurate. Code available at https://github.com/Lopez-Angel/hidden-fold-networks
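The sketch below is a minimal PyTorch illustration (not the paper's official implementation) of the two ideas combined in HFN: a ResNet stage "folded" into a single residual block applied recurrently with shared weights, and an edge-popup-style search for a hidden subnetwork, where per-weight scores are trained while the random weights themselves stay frozen. Names such as `FoldedStage`, `MaskedConv2d`, and the chosen sparsity level are illustrative assumptions, not taken from the repository.

```python
# Minimal sketch of folding + hidden-subnetwork search (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMask(torch.autograd.Function):
    """Keep the top-scoring fraction of weights; pass gradients straight through."""

    @staticmethod
    def forward(ctx, scores, sparsity):
        k = int((1.0 - sparsity) * scores.numel())          # weights to keep
        threshold = scores.flatten().kthvalue(scores.numel() - k + 1).values
        return (scores >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None                             # straight-through estimator


class MaskedConv2d(nn.Conv2d):
    """Conv layer whose random weights are frozen; only the scores are trained."""

    def __init__(self, *args, sparsity=0.5, **kwargs):
        super().__init__(*args, **kwargs)
        self.weight.requires_grad = False                    # weights never updated
        self.scores = nn.Parameter(torch.randn_like(self.weight) * 0.01)
        self.sparsity = sparsity

    def forward(self, x):
        mask = TopKMask.apply(self.scores.abs(), self.sparsity)
        return F.conv2d(x, self.weight * mask, None, self.stride, self.padding)


class FoldedStage(nn.Module):
    """One residual block applied `depth` times with shared weights,
    standing in for a stack of `depth` distinct ResNet blocks."""

    def __init__(self, channels, depth, sparsity=0.5):
        super().__init__()
        self.depth = depth
        self.conv1 = MaskedConv2d(channels, channels, 3, padding=1,
                                  bias=False, sparsity=sparsity)
        self.conv2 = MaskedConv2d(channels, channels, 3, padding=1,
                                  bias=False, sparsity=sparsity)
        # BatchNorms are kept separate per iteration (unshared statistics).
        self.bn1 = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(depth))
        self.bn2 = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(depth))

    def forward(self, x):
        for t in range(self.depth):
            out = F.relu(self.bn1[t](self.conv1(x)))
            out = self.bn2[t](self.conv2(out))
            x = F.relu(x + out)                               # recurrent residual update
        return x


if __name__ == "__main__":
    stage = FoldedStage(channels=64, depth=4)
    y = stage(torch.randn(2, 64, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32])
```

Only the score tensors (and BatchNorm parameters) receive gradients in this sketch; the convolution weights stay at their random initialization, so the stored model reduces to the random seed plus a binary mask per shared block.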