Radical progress in the field of deep learning (DL) has led to unprecedented accuracy in diverse inference tasks. As such, deploying DL models across mobile platforms is vital to enable the development and broad availability of next-generation intelligent apps. Nevertheless, the wide and optimised deployment of DL models is currently hindered by the vast system heterogeneity of mobile devices, the varying computational cost of different DL models and the variability of performance needs across DL applications. This paper proposes OODIn, a framework for the optimised deployment of DL apps across heterogeneous mobile devices. OODIn comprises a novel DL-specific software architecture together with an analytical framework for modelling DL applications that: (1) counteract the variability in device resources and DL models by means of a highly parametrised multi-layer design; and (2) perform a principled optimisation of both model- and system-level parameters through a multi-objective formulation, designed for DL inference apps, in order to adapt the deployment to the user-specified performance requirements and device capabilities. Quantitative evaluation shows that the proposed framework consistently outperforms status-quo designs across heterogeneous devices and delivers up to 4.3x and 3.5x performance gains over highly optimised platform- and model-aware designs, respectively, while effectively adapting execution to dynamic changes in resource availability.
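To make the multi-objective formulation mentioned above concrete, the following is a minimal Python sketch of that kind of deployment decision: enumerating joint model- and system-level configurations and picking the one that maximises a weighted accuracy/energy objective under a user-specified latency requirement. This is not OODIn's implementation; every name, number, weight and the profile() cost model below are hypothetical, introduced only for illustration.

# Minimal sketch of a multi-objective configuration search of the kind described above.
# Not OODIn's implementation: every name, number and weight here is hypothetical.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Config:
    model_variant: str   # model-level parameter (e.g. full-precision vs quantised network)
    compute_engine: str  # system-level parameter (e.g. CPU vs GPU execution)
    num_threads: int     # system-level parameter

def profile(cfg: Config):
    """Toy cost model standing in for on-device profiling.
    Returns (latency_ms, accuracy, energy_mJ) for a configuration."""
    latency, accuracy, energy = {
        "net_fp32": (60.0, 0.72, 90.0),   # hypothetical measurements
        "net_int8": (35.0, 0.70, 55.0),
    }[cfg.model_variant]
    if cfg.compute_engine == "gpu":
        latency, energy = latency * 0.6, energy * 0.8
    else:
        latency /= cfg.num_threads ** 0.5  # assume imperfect CPU thread scaling
    return latency, accuracy, energy

def select_config(latency_slo_ms, w_acc=0.7, w_energy=0.3):
    """Maximise a weighted accuracy/energy objective subject to a
    user-specified latency requirement (the performance target)."""
    candidates = [Config(m, e, t) for m, e, t in product(
        ["net_fp32", "net_int8"], ["cpu", "gpu"], [1, 2, 4])]
    best, best_score = None, float("-inf")
    for cfg in candidates:
        latency, accuracy, energy = profile(cfg)
        if latency > latency_slo_ms:        # drop configurations violating the constraint
            continue
        score = w_acc * accuracy - w_energy * (energy / 100.0)
        if score > best_score:
            best, best_score = cfg, score
    return best

print(select_config(latency_slo_ms=40.0))

Exhaustive enumeration is only viable here because the toy parameter space is tiny; the point is simply that, under this view, the deployment decision reduces to an argmax over joint model/system configurations subject to the user's performance constraints and the device's capabilities.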