This work explores how to design a single neural network capable of adapting to multiple heterogeneous vision tasks, such as image segmentation, 3D detection, and video recognition. This goal is challenging because both the neural architecture search (NAS) spaces and the methods used for different tasks are inconsistent. We address this challenge from both sides. We first introduce a unified design space for multiple tasks and build a multitask NAS benchmark (NAS-Bench-MR) on several widely used datasets, including ImageNet, Cityscapes, KITTI, and HMDB51. We further propose Network Coding Propagation (NCP), which back-propagates gradients of neural predictors to directly update architecture codes along the desired gradient directions for various tasks. In this way, NCP can find optimal architecture configurations in our large search space in seconds. Unlike prior NAS methods, which typically focus on a single task, NCP has several unique benefits. (1) NCP shifts architecture optimization from data-driven to architecture-driven, enabling a joint architecture search across multiple tasks with different data distributions. (2) NCP learns from network codes rather than original data, enabling it to update the architecture efficiently across datasets. (3) In addition to our NAS-Bench-MR, NCP performs well on other NAS benchmarks, such as NAS-Bench-201. (4) Thorough studies of NCP in inter-, cross-, and intra-task settings highlight the importance of cross-task neural architecture design, i.e., multitask neural architectures and architecture transfer between different tasks. Code is available at https://github.com/dingmyu/NCP.