Learning representations of neural network weights given a model zoo is an emerging and challenging area with many potential applications, from model inspection to neural architecture search or knowledge distillation. Recently, an autoencoder trained on a model zoo was able to learn a hyper-representation, which captures intrinsic and extrinsic properties of the models in the zoo. In this work, we extend hyper-representations for generative use to sample new model weights as pre-training. We propose layer-wise loss normalization, which we demonstrate is key to generating high-performing models, and a sampling method based on the empirical density of hyper-representations. The models generated with our methods are diverse, performant, and capable of outperforming conventional baselines for transfer learning. Our results indicate the potential of knowledge aggregation from model zoos to new models via hyper-representations, thereby paving the way for novel research directions.
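To make the sampling idea concrete, the sketch below illustrates one plausible reading of density-based sampling: fit a kernel density estimate to the latent embeddings of a model zoo and decode draws from it into weight vectors. The encoder/decoder, dimensions, and bandwidth are illustrative assumptions, not the paper's exact implementation.

```python
# Hedged sketch: sample new model weights from the empirical density of
# hyper-representation embeddings, then decode back to weight space.
# All shapes and the placeholder decoder are assumptions for illustration.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Stand-in for latent embeddings of a model zoo produced by a trained
# hyper-representation encoder: one row per zoo model.
zoo_embeddings = rng.normal(size=(100, 16))  # (num_models, latent_dim)

# Fit a kernel density estimate to the empirical latent distribution.
density = gaussian_kde(zoo_embeddings.T)

# Draw new latent samples from that empirical density.
new_latents = density.resample(5).T  # (5, latent_dim)

# Placeholder for the hyper-representation decoder that maps a latent
# vector back to a flat vector of network weights (here a fixed random
# projection, purely illustrative).
weight_dim = 2464
projection = rng.normal(size=(16, weight_dim))

def decode(latent: np.ndarray) -> np.ndarray:
    return latent @ projection

# Decoded samples would be reshaped into layer tensors and used as
# pre-training initializations before fine-tuning on a target task.
sampled_weights = np.stack([decode(z) for z in new_latents])
print(sampled_weights.shape)  # (5, 2464)
```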