Deep learning researchers and practitioners usually leverage GPUs to help train their deep neural networks (DNNs) faster. However, choosing which GPU to use is challenging both because (i) there are many options, and (ii) users grapple with competing concerns: maximizing compute performance while minimizing costs. In this work, we present a new practical technique to help users make informed and cost-efficient GPU selections: make performance predictions with the help of a GPU that the user already has. Our technique exploits the observation that, because DNN training consists of repetitive compute steps, predicting the execution time of a single iteration is usually enough to characterize the performance of an entire training process. We make predictions by scaling the execution time of each operation in a training iteration from one GPU to another using either (i) wave scaling, a technique based on a GPU's execution model, or (ii) pre-trained multilayer perceptrons. We implement our technique into a Python library called Habitat and find that it makes accurate iteration execution time predictions (with an average error of 11.8%) on ResNet-50, Inception v3, the Transformer, GNMT, and DCGAN across six different GPU architectures. Habitat supports PyTorch, is easy to use, and is open source.
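To make the core idea concrete, the sketch below illustrates per-operation execution time scaling under simplifying assumptions. Everything here is hypothetical and illustrative: the `GPU_SPECS` table, its numbers, and all function names are invented for this example and are not Habitat's actual API. Habitat's wave scaling models how each kernel's waves of thread blocks occupy the target GPU, effectively interpolating between the compute-bound and memory-bound extremes; this sketch only shows those two endpoints.

```python
import time
import torch

# Hypothetical per-GPU hardware specs (illustrative values, not measurements).
GPU_SPECS = {
    "V100": {"peak_tflops": 15.7, "mem_bw_gbps": 900.0},
    "T4":   {"peak_tflops": 8.1,  "mem_bw_gbps": 320.0},
}

def measure_op_time(op, *args, trials=100):
    """Time one operation on the local GPU, in milliseconds."""
    for _ in range(10):          # warm up so kernels are compiled/cached
        op(*args)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(trials):
        op(*args)
    torch.cuda.synchronize()     # wait for all queued kernels to finish
    return (time.perf_counter() - start) / trials * 1e3

def scale_op_time(local_ms, local_gpu, target_gpu, compute_bound=True):
    """Naively scale one op's measured time from local_gpu to target_gpu.

    Compute-bound ops scale with the ratio of peak FLOPS; memory-bound ops
    scale with the ratio of memory bandwidth. Wave scaling interpolates
    between these extremes using the GPU's execution model.
    """
    src, dst = GPU_SPECS[local_gpu], GPU_SPECS[target_gpu]
    if compute_bound:
        return local_ms * src["peak_tflops"] / dst["peak_tflops"]
    return local_ms * src["mem_bw_gbps"] / dst["mem_bw_gbps"]

def predict_iteration_time(op_times_ms, local_gpu, target_gpu):
    """Sum per-op predictions to estimate one training iteration's time.

    Because DNN training repeats the same compute steps, this single
    iteration estimate characterizes the whole training run.
    """
    return sum(
        scale_op_time(ms, local_gpu, target_gpu, compute_bound=cb)
        for ms, cb in op_times_ms
    )

# Example with made-up numbers: per-op times measured on a V100, each tagged
# as compute-bound (True) or memory-bound (False), predicted for a T4.
ops = [(1.2, True), (0.4, False), (2.1, True)]
print(predict_iteration_time(ops, "V100", "T4"))
```

For operations whose behavior does not scale cleanly with either ratio (the paper handles some of these with pre-trained multilayer perceptrons), a learned predictor would replace `scale_op_time` for those ops.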