Deep learning researchers and practitioners usually leverage GPUs to train their deep neural networks (DNNs) faster. However, choosing which GPU to use is challenging both because (i) there are many options, and (ii) users grapple with competing concerns: maximizing compute performance while minimizing costs. In this work, we present a new practical technique to help users make informed and cost-efficient GPU selections: making performance predictions with the help of a GPU that the user already has. Our technique exploits the observation that, because DNN training consists of repetitive compute steps, predicting the execution time of a single iteration is usually enough to characterize the performance of an entire training process. We make predictions by scaling the execution time of each operation in a training iteration from one GPU to another using either (i) wave scaling, a technique based on a GPU's execution model, or (ii) pre-trained multilayer perceptrons. We implement our technique in a Python library called Surfer and find that it makes accurate iteration execution time predictions on ResNet-50, Inception v3, the Transformer, GNMT, and DCGAN across six different GPU architectures. Surfer currently supports PyTorch, is easy to use, and requires only a few lines of code.
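The core idea described above can be sketched in a few lines: measure each operation's execution time in one training iteration on the GPU at hand, scale each time to the target GPU (via wave scaling or a pre-trained MLP), and sum the results. The function and variable names below are illustrative assumptions for this sketch, not Surfer's actual API, and the fixed scaling factor stands in for the real per-operation predictors.

```python
# Minimal sketch of per-operation execution time scaling, assuming
# hypothetical names; the constant factor below is a placeholder for a
# real predictor (wave scaling or a pre-trained multilayer perceptron).

def predict_iteration_time(op_times_ms, scale_op):
    """Predict one training iteration's time on a target GPU.

    op_times_ms: list of (op_name, time_ms) measured on the source GPU.
    scale_op:    callable mapping (op_name, time_ms) -> predicted time_ms
                 on the target GPU.
    """
    return sum(scale_op(name, t) for name, t in op_times_ms)

# Toy example: a single uniform scaling factor, purely for illustration.
measured = [("conv2d", 4.0), ("batch_norm", 0.6), ("relu", 0.2)]
predicted = predict_iteration_time(measured, lambda name, t: t * 0.5)
```

Because one iteration characterizes the whole repetitive training process, multiplying the predicted iteration time by the planned number of steps yields an estimate of total training time on the candidate GPU.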