Achieving faster execution with shorter compilation time can foster further diversity and innovation in neural networks. However, the current paradigm of executing neural networks relies on hand-optimized libraries, traditional compilation heuristics, or, more recently, genetic algorithms and other stochastic methods. These methods suffer from frequent, costly hardware measurements, which make them not only time consuming but also prone to suboptimal results. We therefore devise a solution that learns to quickly adapt to a previously unseen design space for code optimization, both accelerating the search and improving the output performance. This solution, dubbed Chameleon, leverages reinforcement learning, whose search converges in fewer steps, and develops an adaptive sampling algorithm that not only focuses the costly samples (real hardware measurements) on representative points but also uses domain-knowledge-inspired logic to improve the samples themselves. Experimentation with real hardware shows that Chameleon provides a 4.45x speedup in optimization time over AutoTVM, while also improving the inference time of modern deep networks by 5.6%.
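To make the adaptive-sampling idea concrete, the following is a minimal sketch (not the actual Chameleon or AutoTVM API) of spending hardware measurements only on representative points: candidate configurations proposed by the search are clustered, and only one candidate near each cluster centroid is sent for real measurement. The function and variable names (e.g. adaptive_sample, candidate_knobs) are illustrative placeholders.

import numpy as np
from sklearn.cluster import KMeans

def adaptive_sample(candidate_knobs, num_measurements, rng_seed=0):
    """Select a small, diverse subset of candidate configurations so that
    costly hardware measurements are focused on representative points.
    This is a sketch of the general technique, not the paper's exact code."""
    candidates = np.asarray(candidate_knobs, dtype=float)
    k = min(num_measurements, len(candidates))
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=rng_seed).fit(candidates)
    representatives = []
    for center in kmeans.cluster_centers_:
        # Snap each centroid back to the nearest real candidate, since an
        # arbitrary centroid may not correspond to a valid knob setting.
        nearest = candidates[np.argmin(np.linalg.norm(candidates - center, axis=1))]
        representatives.append(nearest)
    return np.array(representatives)

# Hypothetical usage: `candidates` would come from the search agent; only the
# returned representatives would be measured on real hardware.
candidates = np.random.randint(1, 64, size=(1000, 4))  # e.g. tiling knobs
to_measure = adaptive_sample(candidates, num_measurements=8)
print(to_measure)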