Learning representations of neural network weights given a model zoo is an emerging and challenging area with many potential applications, from model inspection to neural architecture search and knowledge distillation. Recently, an autoencoder trained on a model zoo was shown to learn a hyper-representation that captures intrinsic and extrinsic properties of the models in the zoo. In this work, we extend hyper-representations for generative use, sampling new model weights. We propose layer-wise loss normalization, which we demonstrate is key to generating high-performing models, as well as several sampling methods based on the topology of hyper-representations. The models generated with our methods are diverse and performant, and outperform strong baselines on several downstream tasks: initialization, ensemble sampling, and transfer learning. Our results indicate the potential of aggregating knowledge from model zoos into new models via hyper-representations, paving the way for novel research directions.
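The abstract names layer-wise loss normalization without spelling it out. Below is a minimal sketch of one plausible reading, assuming the autoencoder's reconstruction error is rescaled per layer by the variance of that layer's ground-truth weights, so layers with small weight magnitudes contribute equally to the loss; the function name and the `layer_slices` bookkeeping structure are hypothetical, not taken from the paper.

```python
import torch

def layerwise_normalized_mse(recon: torch.Tensor,
                             target: torch.Tensor,
                             layer_slices: dict,
                             eps: float = 1e-8) -> torch.Tensor:
    """Reconstruction loss with per-layer normalization.

    recon, target: flattened weight vectors, shape (batch, num_params).
    layer_slices:  maps layer name -> index range in the flat vector
                   (hypothetical bookkeeping derived from the zoo's
                   shared architecture).
    Each layer's MSE is divided by the variance of that layer's true
    weights, so no single layer dominates the training signal.
    """
    total = recon.new_zeros(())
    for sl in layer_slices.values():
        t, r = target[:, sl], recon[:, sl]
        # Normalize the squared error by the per-layer weight variance.
        total = total + ((r - t) ** 2).mean() / (t.var() + eps)
    return total / len(layer_slices)

# Usage on a toy two-layer weight vector:
slices = {"conv1": slice(0, 450), "fc1": slice(450, 1000)}
w_true = torch.randn(8, 1000)            # batch of 8 flattened models
w_rec = w_true + 0.1 * torch.randn_like(w_true)
print(layerwise_normalized_mse(w_rec, w_true, slices))
```

Under this reading, the normalization matters because convolutional and fully connected layers typically have very different weight scales, and an unnormalized MSE would be dominated by whichever layer has the largest magnitudes.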