Deep Neural Networks (DNNs) have achieved great success in a wide variety of machine learning (ML) applications, delivering high-quality inference solutions in computer vision, natural language processing, and virtual reality, among others. However, DNN-based ML applications also bring substantially increased computational and storage requirements, which are particularly challenging for embedded systems with limited compute and storage resources, tight power budgets, and small form factors. Further challenges arise from diverse application-specific requirements, including real-time responsiveness, high-throughput performance, and reliable inference accuracy. To address these challenges, we introduce a series of effective design methodologies, including efficient ML model designs, customized hardware accelerator designs, and hardware/software co-design strategies, to enable efficient ML applications on embedded systems.