On Constrained Optimization in Differentiable Neural Architecture Search

Differentiable Architecture Search (DARTS) is a recently proposed neural architecture search (NAS) method based on a differentiable relaxation. Due to its success, numerous variants that analyze and improve parts of the DARTS framework have since been proposed. By treating the problem as a constrained bilevel optimization, we propose and analyze three improvements: to architectural weight competition, to update scheduling, and to regularization towards discretization. First, we introduce a new approach to the activation of architecture weights, which prevents confounding competition within an edge and allows for fair comparison across edges to aid in discretization. Next, we propose a dynamic schedule based on per-minibatch network information to make architecture updates more informed. Finally, we consider two regularizations, one based on proximity to discretization and one based on the Alternating Direction Method of Multipliers (ADMM) algorithm, to promote early discretization. Our results show that the new activation scheme reduces final architecture size and that the regularizations improve the reliability of search results while maintaining performance comparable to the state of the art in NAS, especially when used with our new dynamic informed schedule.
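To make the notion of competition within an edge concrete, the following is a minimal sketch, not the authors' implementation. It contrasts the softmax relaxation used in standard DARTS, where the weights of the candidate operations on an edge sum to one, with an independent per-operation activation. The sigmoid is an assumption chosen purely for illustration, since the abstract does not name the activation it proposes; the function names and the example weights are likewise hypothetical.

```python
import math


def softmax(alphas):
    """Standard DARTS relaxation: weights on one edge sum to 1, so operations compete."""
    m = max(alphas)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(alphas):
    """Illustrative non-competing activation (an assumption, not taken from the abstract)."""
    return [1.0 / (1.0 + math.exp(-a)) for a in alphas]


def mixed_output(op_outputs, weights):
    """Continuous relaxation of an edge: weighted sum of candidate operation outputs."""
    return sum(w * o for w, o in zip(weights, op_outputs))


# Hypothetical architecture weights for three candidate operations on one edge.
alphas_before = [0.5, 0.0, -0.5]
alphas_after = [2.0, 0.0, -0.5]  # only the first weight is increased

soft_before, soft_after = softmax(alphas_before), softmax(alphas_after)
sig_before, sig_after = sigmoid(alphas_before), sigmoid(alphas_after)

# Under softmax, raising one weight lowers the activations of the others.
print("softmax:", soft_before, "->", soft_after)
# Under an independent activation, the untouched operations keep their values.
print("sigmoid:", sig_before, "->", sig_after)
```

In this sketch, increasing a single architecture weight under softmax pushes down the activations of the other operations on the same edge even though their own weights are unchanged, whereas the independent activation leaves them untouched. Independent activations also share a common scale, which is one way values could be compared across edges.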
