As machine learning techniques become ubiquitous, the efficiency of neural network implementations becomes correspondingly paramount. Frameworks such as Halide and TVM separate the algorithmic representation of a network from the schedule that determines its implementation. Finding good schedules, however, remains extremely challenging. We model this scheduling problem as a sequence of optimization choices, and present a new technique to accurately predict the expected performance of a partial schedule. By leveraging these predictions, we can make these optimization decisions greedily and rapidly identify an efficient schedule. This enables us to find schedules that improve the throughput of deep neural networks by 2.6x over Halide and 1.5x over TVM. Moreover, our technique is two to three orders of magnitude faster than these tools, completing in seconds instead of hours.
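The greedy search described above can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation: the names `greedy_schedule`, `candidates_fn`, and `predict_fn` are hypothetical, and the toy cost function stands in for the learned performance predictor.

```python
# Hedged sketch of greedy schedule construction guided by a
# performance predictor. All names here are illustrative, not
# taken from the paper's actual system.

def greedy_schedule(initial, candidates_fn, predict_fn):
    """Extend a partial schedule one optimization choice at a time,
    always taking the choice with the best predicted performance."""
    schedule = list(initial)
    while True:
        candidates = candidates_fn(schedule)
        if not candidates:
            return schedule
        # Pick the candidate whose extended partial schedule is
        # predicted to be fastest (lower predicted cost is better).
        best = min(candidates, key=lambda c: predict_fn(schedule + [c]))
        schedule.append(best)

# Toy example: two choices of tile size; the stand-in "predictor"
# simply favors tile size 32 at every level.
def candidates_fn(schedule):
    return [8, 16, 32, 64] if len(schedule) < 2 else []

def predict_fn(schedule):
    return sum(abs(t - 32) for t in schedule)

print(greedy_schedule([], candidates_fn, predict_fn))  # → [32, 32]
```

Because each choice is committed immediately based on the predictor's estimate, the search visits only a handful of candidates per step rather than enumerating full schedules, which is what makes the greedy approach orders of magnitude faster than exhaustive autotuning.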