Deep neural networks have grown larger over the years, with a corresponding increase in the computational resources demanded for inference; this incurs exacerbated costs and leaves little room for deployment on devices with limited battery and other resources for real-time applications. Multi-exit architectures are a type of deep neural network interleaved with several output (or exit) layers at varying depths of the model. They provide a sound approach to improving the computational time and energy utilization of running a model by producing predictions from early exits. In this work, we present a novel and architecture-agnostic approach for robust training of multi-exit architectures, termed consistent exit training. The crux of the method lies in a consistency-based objective that enforces prediction invariance over clean and perturbed inputs. We leverage weak supervision to align model output with consistency training and jointly optimize the dual losses in a multi-task learning fashion over the exits in a network. Our technique enables exit layers to generalize better when confronted with increasing uncertainty, resulting in superior quality-efficiency trade-offs. We demonstrate, through extensive evaluation on challenging learning tasks involving sensor data, that our approach allows examples to exit earlier with a better detection rate and without executing all the layers in a deep model.
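To make the dual-loss objective concrete, the following is a minimal PyTorch-style sketch of how a supervised term and a consistency term might be combined per exit. The function name `consistent_exit_loss`, the KL-divergence form of the consistency term, the perturbation interface, and the weighting `lam` are illustrative assumptions for exposition, not the paper's exact formulation.

```python
# A minimal sketch under the assumptions stated above: `model` is a
# hypothetical multi-exit network that returns one logit tensor per exit.
import torch
import torch.nn.functional as F

def consistent_exit_loss(model, x_clean, x_perturbed, y, lam=1.0):
    """Sum a supervised loss and a consistency loss over all exits."""
    exits_clean = model(x_clean)      # list of [batch, classes] logits
    exits_pert = model(x_perturbed)   # same exits on the perturbed input
    total = 0.0
    for logits_c, logits_p in zip(exits_clean, exits_pert):
        # Supervised term: (weak) labels y supervise every exit.
        supervised = F.cross_entropy(logits_c, y)
        # Consistency term: predictions on perturbed inputs should match
        # the clean-input distribution (treated as a fixed target).
        consistency = F.kl_div(
            F.log_softmax(logits_p, dim=-1),
            F.softmax(logits_c, dim=-1).detach(),
            reduction="batchmean",
        )
        total = total + supervised + lam * consistency
    return total
```

Summing the per-exit losses corresponds to the multi-task optimization over exits described above: every exit receives both a supervised signal and a pressure toward prediction invariance under input perturbation.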