We explore the use of reinforcement learning (RL) agents that learn to perform neural network subgraph transformations and achieve a high level of performance without the need for expert-designed heuristics. Reducing the compute requirements of deep learning models is the focus of extensive research, and many systems, optimisations, and just-in-time (JIT) compilers have been proposed to decrease runtime. Recent work has applied reinforcement learning to computer systems with some success, especially using model-free RL techniques. Model-based reinforcement learning methods have received increased research attention because they can learn the transition dynamics of the environment; this learned model can then serve as a hallucinated environment in which to train the agent, increasing sample efficiency compared to model-free approaches. Furthermore, when a world model is used as a simulated environment, batched rollouts can proceed safely in parallel; this is particularly valuable in systems environments, where performing a single action can take orders of magnitude longer than in the simple emulators used for video games. We propose a design for a model-based agent that learns to optimise the architecture of neural networks by performing a sequence of subgraph transformations to reduce model runtime. We show that our approach matches the state of the art on common convolutional networks and outperforms it by up to 5% on transformer-style architectures.
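To make the core idea concrete, below is a minimal sketch (not the paper's actual implementation) of rollouts inside a learned world model: a transition model predicts the next graph state and reward for a candidate subgraph transformation, so batched "imagined" rollouts can run in parallel without ever invoking the slow real environment. All names and dimensions here (WorldModel, STATE_DIM, NUM_XFORMS, the random policy) are hypothetical placeholders.

```python
import torch
import torch.nn as nn

STATE_DIM = 64    # hypothetical embedding size of the computation graph
NUM_XFORMS = 16   # hypothetical number of candidate subgraph transformations


class WorldModel(nn.Module):
    """Learned transition model: (state, action) -> (next state, reward)."""

    def __init__(self):
        super().__init__()
        self.action_emb = nn.Embedding(NUM_XFORMS, STATE_DIM)
        self.dynamics = nn.Sequential(
            nn.Linear(2 * STATE_DIM, 128),
            nn.ReLU(),
        )
        self.next_state = nn.Linear(128, STATE_DIM)
        self.reward = nn.Linear(128, 1)  # e.g. predicted runtime reduction

    def forward(self, state, action):
        h = self.dynamics(torch.cat([state, self.action_emb(action)], dim=-1))
        return self.next_state(h), self.reward(h).squeeze(-1)


@torch.no_grad()
def imagined_rollout(model, policy, state, horizon=8):
    """Batched rollout inside the world model: the real system is never
    touched, so many trajectories can be simulated in parallel."""
    total_reward = torch.zeros(state.shape[0])
    for _ in range(horizon):
        action = policy(state)               # pick a subgraph transformation
        state, reward = model(state, action)  # imagined transition
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    model = WorldModel()
    # Hypothetical random policy over transformations, batch of 32 graphs.
    policy = lambda s: torch.randint(NUM_XFORMS, (s.shape[0],))
    states = torch.randn(32, STATE_DIM)
    print(imagined_rollout(model, policy, states).shape)  # torch.Size([32])
```

In a realistic setting, the world model would first be fitted to transitions collected from the actual graph-rewrite environment, and the agent's policy would then be updated from these imagined trajectories rather than from costly real-environment interactions.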