Existing optical flow estimators usually employ network architectures originally designed for image classification as the encoder to extract per-pixel features. However, due to the natural differences between the two tasks, architectures designed for image classification may be sub-optimal for flow estimation. To address this issue, we propose FlowNAS, a neural architecture search method that automatically finds a better encoder architecture for the flow estimation task. We first design a suitable search space including various convolutional operators and construct a weight-sharing super-network for efficiently evaluating the candidate architectures. Then, to better train the super-network, we propose Feature Alignment Distillation, which utilizes a well-trained flow estimator to guide the training of the super-network. Finally, a resource-constrained evolutionary algorithm is exploited to find an optimal architecture (i.e., sub-network). Experimental results show that the discovered architecture, with the weights inherited from the super-network, achieves 4.67\% F1-all error on KITTI, an 8.4\% reduction over the RAFT baseline, surpassing the state-of-the-art handcrafted models GMA and AGFlow, while reducing model complexity and latency. The source code and trained models will be released at https://github.com/VDIGPKU/FlowNAS.
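The abstract does not spell out how Feature Alignment Distillation is computed. The PyTorch sketch below illustrates one plausible reading, in which the encoder features of a frozen, well-trained teacher flow estimator (e.g., a RAFT encoder) supervise the features of a sampled sub-network of the super-network via a 1x1 projection and an L2 loss. The class and variable names, the projection, and the exact loss form are assumptions for illustration, not the paper's actual implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FeatureAlignmentDistillation(nn.Module):
        # Hypothetical sketch: align the sampled sub-network's per-pixel
        # features with those of a frozen, well-trained teacher flow
        # estimator using an L2 loss.
        def __init__(self, student_channels: int, teacher_channels: int):
            super().__init__()
            # 1x1 projection so student features can be compared with the
            # teacher's feature map even when channel widths differ.
            self.proj = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

        def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
            aligned = self.proj(student_feat)
            # Match spatial resolution if the sampled sub-network uses a
            # different stride than the teacher encoder.
            if aligned.shape[-2:] != teacher_feat.shape[-2:]:
                aligned = F.interpolate(aligned, size=teacher_feat.shape[-2:],
                                        mode="bilinear", align_corners=False)
            # Teacher features are detached: only the student is updated.
            return F.mse_loss(aligned, teacher_feat.detach())

    # Usage: add this distillation term to the supervised flow loss when
    # training a randomly sampled sub-network of the super-network.
    if __name__ == "__main__":
        fad = FeatureAlignmentDistillation(student_channels=128, teacher_channels=256)
        student_feat = torch.randn(2, 128, 48, 64)   # features from a sampled sub-network
        teacher_feat = torch.randn(2, 256, 48, 64)   # features from the frozen teacher
        loss_distill = fad(student_feat, teacher_feat)
        print(loss_distill.item())

In this reading, the distillation loss would be weighted and summed with the standard flow supervision at each training step; the weighting scheme is not specified in the abstract.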