Exploring the Impact of Virtualization on the Usability of the Deep Learning Applications

Deep Learning (DL)-based applications are becoming increasingly popular and advancing at an unprecedented pace. While much research is being undertaken to enhance Deep Neural Networks (DNNs), the centerpiece of DL applications, the practical challenges of deploying these applications in Cloud and Edge systems, and their impact on application usability, have not been sufficiently investigated. In particular, the impact of the different virtualization platforms offered by the Cloud and Edge on the usability of DL applications (in terms of End-to-End (E2E) inference time) has remained an open question. Importantly, resource elasticity (by means of scale-up), CPU pinning, and processor type (CPU vs. GPU) configurations have been shown to influence the virtualization overhead. Accordingly, the goal of this research is to study the impact of these potentially decisive deployment options on the E2E performance, and thus the usability, of DL applications. To that end, we measure the impact of four popular execution platforms (namely, bare-metal, virtual machine (VM), container, and container in VM) on the E2E inference time of four types of DL applications while varying the processor configuration (scale-up, CPU pinning) and processor type. This study reveals a set of interesting and sometimes counter-intuitive findings that Cloud solution architects can use as best practices to efficiently deploy DL applications in various systems. A notable finding is that solution architects must be aware of DL application characteristics, particularly their pre- and post-processing requirements, to be able to optimally choose and configure an execution platform, determine whether to use a GPU, and decide on an efficient scale-up range.
