Deep neural networks (DNNs) have succeeded in many perception tasks, e.g., computer vision, natural language processing, and reinforcement learning. High-performing DNNs, however, rely heavily on intensive resource consumption. For example, training a DNN requires a large amount of dynamic memory, a large-scale dataset, and a large number of computations (a long training time); even inference with a DNN demands a large amount of static storage, computations (a long inference time), and energy. Therefore, state-of-the-art DNNs are often deployed on cloud servers equipped with a large number of supercomputers, a high-bandwidth communication bus, a shared storage infrastructure, and a high-power supply. Recently, emerging intelligent applications, e.g., AR/VR, mobile assistants, and the Internet of Things, require us to deploy DNNs on resource-constrained edge devices. Compared to a cloud server, edge devices often have rather limited resources. To deploy DNNs on edge devices, we need to reduce their size, i.e., we target a better trade-off between resource consumption and model accuracy. In this dissertation, we study four edge intelligence scenarios, i.e., Inference on Edge Devices, Adaptation on Edge Devices, Learning on Edge Devices, and Edge-Server Systems, and develop different methodologies to enable deep learning in each scenario. Since current DNNs are often over-parameterized, our goal is to find and reduce the redundancy of the DNNs in each scenario.