We propose firefly neural architecture descent, a general framework for progressively and dynamically growing neural networks to jointly optimize the networks' parameters and architectures. Our method works in a steepest-descent fashion, iteratively finding the best network within a functional neighborhood of the original network that includes a diverse set of candidate network structures. Using a Taylor approximation, the optimal network structure in the neighborhood can be found with a greedy selection procedure. We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate yet resource-efficient neural architectures that avoid catastrophic forgetting in continual learning. Empirically, firefly descent achieves promising results on both neural architecture search and continual learning. In particular, on a challenging continual image classification task, it learns networks that are smaller in size but achieve higher average accuracy than those learned by state-of-the-art methods.
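To make the greedy selection idea more concrete, the following is a minimal sketch, not the authors' implementation: each candidate growth direction is scored by a first-order Taylor approximation of the loss change, and the candidates with the largest predicted decrease are kept under a fixed budget. The function names `score_candidates` and `grow_step`, the candidate generation, and the budget are all illustrative assumptions.

```python
# Minimal sketch of Taylor-approximation-based greedy candidate selection.
# Hypothetical names; this illustrates the idea, not the paper's actual code.
import numpy as np

def score_candidates(grad, deltas):
    """Predicted loss change for each candidate perturbation delta,
    via first-order Taylor expansion: L(w + delta) - L(w) ~= grad . delta."""
    return np.array([float(grad @ d) for d in deltas])

def grow_step(grad, deltas, budget):
    """Greedily keep up to `budget` candidates whose Taylor scores predict
    the largest loss decrease (most negative scores)."""
    scores = score_candidates(grad, deltas)
    order = np.argsort(scores)                    # most negative (best) first
    chosen = [int(i) for i in order[:budget] if scores[i] < 0.0]
    return chosen, scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.normal(size=16)                    # loss gradient at current weights
    deltas = [0.01 * rng.normal(size=16) for _ in range(8)]  # candidate growth directions
    chosen, scores = grow_step(grad, deltas, budget=3)
    print("selected candidates:", chosen, "predicted loss changes:", scores[chosen])
```

In the full method, the candidates would correspond to structural changes such as adding neurons or layers rather than arbitrary weight perturbations; the sketch only shows how a Taylor score can drive the greedy choice.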