Designing deep learning-based solutions is quickly becoming a race to train deeper models with ever more layers. While a large, deep model can deliver competitive accuracy, it introduces significant logistical challenges and heavy resource requirements during development and deployment. This has been one of the key reasons deep learning models are not extensively used in production environments, especially on edge devices. There is a pressing need to optimize and compress these deep learning models to enable on-device intelligence. In this research, we introduce Deeplite Neutrino, a black-box framework for production-ready optimization of deep learning models. The framework provides a simple mechanism for end-users to specify constraints, such as a tolerable drop in accuracy or a target size for the optimized model, that guide the whole optimization process. The framework is easy to integrate into an existing production pipeline and is available as a Python package supporting the PyTorch and TensorFlow libraries. The optimization performance of the framework is demonstrated across multiple benchmark datasets and popular deep learning models. Further, the framework is currently used in production, and results and testimonials from several clients are summarized.
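To make the constraint-driven workflow concrete, the following minimal sketch illustrates how an end-user might launch an optimization job from an existing PyTorch pipeline. The module path, class names, and configuration keys shown here are assumptions made for illustration and are not specified in this abstract; `delta` stands for the tolerable accuracy drop mentioned above.

    # Illustrative sketch only -- the neutrino module path, class names, and
    # config keys below are assumptions, not the documented API.
    import torch
    import torchvision
    from torch.utils.data import DataLoader, TensorDataset

    from neutrino.job import Neutrino                              # assumed entry point
    from neutrino.framework.torch_framework import TorchFramework  # assumed PyTorch wrapper

    # A reference model and data splits from the user's existing PyTorch pipeline
    # (a small random dataset is used here only to keep the sketch self-contained).
    reference_model = torchvision.models.resnet18(num_classes=10)
    dummy = TensorDataset(torch.randn(64, 3, 224, 224), torch.randint(0, 10, (64,)))
    data_splits = {'train': DataLoader(dummy, batch_size=32),
                   'test': DataLoader(dummy, batch_size=32)}

    # End-user constraints guiding the optimization process.
    config = {
        'delta': 1.0,    # tolerable drop in accuracy (percentage points)
        'level': 2,      # how aggressively to compress the model
        'device': 'GPU',
    }

    optimized_model = Neutrino(framework=TorchFramework(),
                               data=data_splits,
                               model=reference_model,
                               config=config).run()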